The country of the address indicated in this Box is the applicant's State (i.e.
country) of residence if no State of residence is indicated below;



(ii) if, in Box No. II or in any of the sub-boxes of Box No. III,
the indication "the States indicated in the Supplemental
the inventor or the inventor/applicant is not inventor for
(iv) if, in addition to the agent(s) indicated in Box No. IV,
there are further agents:

(v) if, in Box No. V, the name of any State (or OAPI) is 
"certificate of addition," or if, in Box No. V, the name of
the United States of America is accompanied by an 
indication "Continuation" or "Continuation-in-part": 



(vi) if there are more than three earlier applications whose
priority is claimed: 



2. If the applicant claims, in respect of any designated 
Office, the benefits of provisions of the national law 
concerning non-prejudicial disclosures or exceptions to 
lack of novelty: 



in such case, write "Continuation of Box No. II" or "Continuation of Box No.
III" or "Continuation of Boxes No. II and III" (as the case may be), indicate
the name of the applicant(s) involved and, next to (each) such name, 
State(s) (and/or, where applicable, ARIPO, Eurasian, European or OAPI 
patent) for the purposes of which the named person is applicant; 

in such case, write "Continuation of Box No. II" or "Continuation of Box No.
III" or "Continuation of Boxes No. II and III" (as the case may be), indicate
the name of the inventor(s) and, next to (each) such name, State(s) (and/or, 
where applicable, ARIPO, Eurasian, European or OAPI patent) for the 
purposes of which the named person is inventor; 

in such case, write "Continuation of Box No. IV" and indicate for each further
agent the same type of information as required in Box No. IV; 

in such case, write "Continuation of Box No. V" and the name of each State 
involved (or OAPI), and after the name of each such State (or OAPI), the 
number of the parent title or parent application and the date of grant of the 
parent title or filing of the parent application; 



in such case, write "Continuation of Box No. VI" and indicate for each 
additional earlier application the same type of information as required in
Box No. VI. 
Disclosures or Exceptions to Lack of Novelty" and furnish that
Continuation of Box No. IV



PURVIS, William Michael Cameron 
COTTER, Ivan John 
PILCH, Adam John Michael 
CRISP, David Nomian 
ROBINSON, Nigel Alexander Julian 
HARRIS, Ian Richard 
HARDING, Charles Thomas 
TURNER, James Arthur 
MALLALIEU, Catherine Louise 
PRATT, Richard Wilson 
PRICE, Paul Anthony King 
HOLMES, Miles 
HORNER, David Richard 
NACHSHEN, Neil 
POTTER, Julian 
HAINES, Miles 



Form PCT/RO/101 (supplemental sheet) (January 1997; reprint January 1998)



See Notes to the request form 



Sheet No. 5



PCT/GB99/02658







Filing date of earlier application (day/month/year)

Number of earlier application

Where earlier application is:

national application: country

regional application: * regional Office

international application:
Box No. VI PRIORITY CLAIM






Further priority claims are indicated in the Supplemental Box 



The priority of the following earlier application(s) is hereby claimed:



The receiving Office is hereby requested to prepare and transmit to the International Bureau a certified copy of
the earlier application(s) (only if the earlier application was filed with the Office which for the purposes of the
present international application is the receiving Office) identified above as item(s): (1) (2)



* Where the earlier application is an ARIPO application, it is mandatory to indicate in the Supplemental Box at least one country party to the Paris Convention for
the Protection of Industrial Property for which that earlier application was filed (Rule 4.10(b)(ii)). See Supplemental Box.



Box No. VII INTERNATIONAL SEARCHING AUTHORITY 



Choice of International Searching Authority (ISA) 

(If two or more International Searching Authorities are 
competent to carry out the international search, indicate the 
Authority chosen; the two-letter code may be used): 

ISA / EPO 



Request to use results of earlier search; reference to that search (if an earlier 
search has been carried out by or requested from the International Searching 
Authority): 

Country (or regional Office): 



Date (day/month/year) 



Number: 



Box No. VIII CHECK LIST; LANGUAGE OF FILING



This international application contains the 
claims

abstract 

drawings 
This international application is accompanied by the item(s) marked below:

1. ^ fee calculation sheet 

2. separate signed power of attorney 

3. copy of general power of attorney; reference number, if any:

4. statement explaining lack of signature

5. priority document(s) identified in Box No. VI as item(s):

6. translation of international application into (language):

7. separate indications concerning deposited microorganism or other biological material

8. nucleotide and/or amino acid sequence listing in computer readable form

9. other (specify):



Figure of the drawings which 
should accompany the abstract: 


Language of filing of the
international application: ENGLISH


Box No. IX SIGNATURE OF APPLICANT OR AGENT 



Next to each signature, indicate the name of the person signing and the capacity in which the person signs (if such capacity is not obvious from reading the request) 
MASCHIO, Antonio
Date of actual receipt of the purported
international application:

3. Corrected date of actual receipt due to later but
timely received papers or drawings completing
the purported international application:

received:

4. Date of timely receipt of the required
corrections under PCT Article 11(2):

not received:

5. International Searching Authority
specified by the applicant: ISA /

6. Transmittal of search copy delayed
until search fee paid

Date of receipt of the record copy by
the International Bureau:
Gene

The present invention relates to a gene which is involved in the control of obesity and 
fertility. In particular, the gene disclosed herein is involved in late-onset obesity in males, 
which is coupled with infertility. Moreover, the invention relates to animal models for 
late-onset obesity. 

Obesity differs from being overweight in that it is characterised by an increase in the
proportion of body fat present, as opposed to a mere increase in body weight, and is one of
the major contributors to chronic disease development. Mortality in overweight males
(5-15% overweight) increases to 125%, but rises up to 500% in obese males. Laboratory and
epidemiological studies have also shown that mortality amongst obese males aged
between 25 and 34 can increase up to twelve times. This increase in mortality is caused by
the multitude of health risks associated with obesity, including cardiovascular disease,
hypertension, diabetes, sleep apnoea (the abnormal ceasing of breathing during sleep),
hernias, flat feet, arthritis, osteoarthritis, some cancers, varicose veins, gout, respiratory
problems, gall bladder disease and liver disease. The more serious complaints include:

• Cardiovascular Disease - Obesity is an important factor in cardiovascular disease in 
both increasing blood cholesterol and blood pressure, and has been shown to increase 
the risk of disease by up to three times. Obesity also increases the work of the heart - 
cardiac volume, stroke volume and blood volume must all increase to cope with the
increased weight. The detrimental effects of obesity on cardiovascular disease are 
reversible with weight loss. 

• Diabetes - Excess weight also increases the chance of acquiring diabetes mellitus by 
threefold and increases the risk of dying from diabetes by up to eight times. The 
mechanism by which an increase in body fat increases the risk of diabetes is largely 
unknown. However, it is postulated that a slight increase in circulating serum glucose 
or L-leucine may increase the basal levels of insulin. This rise in insulin is then 
associated with resistance to insulin, caused both by decreased intracellular effects of
insulin and a reduction in insulin receptors in the cell. Obesity has also been proven to 
alter pancreatic function and in susceptible individuals this may lead to the
development of diabetes mellitus.

• Cancer - cancer has also shown a significant association with obesity, but the 
mechanisms are not understood. One proposed explanation is that obesity alters 
hormone levels and this could influence cancer development. In males, obesity is 
associated with a greater risk of developing prostate and colorectal cancers. 

• Gall bladder Disease - obese males show a four times increase in the risk of 
developing gall bladder disease. 

• Endocrine Function - this is also modified by elevated fat levels. The beta cells in the
islets of Langerhans are enlarged in obese people and glucose intolerance is also
frequently inhibited.

• Reproductive System - obesity in males also impairs the functioning of the 
reproductive system. 

• Growth Hormone - obesity impairs the release of growth hormone from the pituitary
gland. This problem is of particular importance in obese children whose growth may 
be impaired. It is fully reversible if weight is lost. 

Numerous genes, gene products and their receptors have been characterised in rodent
models of obesity which bear mutations associated with different forms of obesity (Bray
& York, 1979; Comuzzie & Allison, 1998). Most such spontaneous mutations are
recessive, and include mutations affecting leptin and its receptors in such models as ob/ob
and db/db mice, Zucker fa/fa rats, Koletsky (f) rats, OLETF rats, corpulent (cp) rats and
their substrains or derivatives (Zhang et al., 1994; Tartaglia et al., 1995; Eda et al., 1996;
Takaya et al., 1996; Chen et al., 1996; Jamal et al., 1997; Kahle et al., 1997; Lee et al.,
1997; Moon & Friedman, 1997; Takiguchi et al., 1998). These phenotypes are thought to
result from a disruption in leptin or its receptors or in CCK-A receptors, and affect the
control of food intake or energy expenditure or metabolism, and disrupt the gonadotrophic
axis in females.

There are numerous other candidate genes putatively involved in obesity, some of which
have recently been summarised by Comuzzie & Allison (1998; Table 1). These
include tubby (tub), agouti, Nhl2, MCH, CRH, hypocretins or orexins, CART peptides,
melanocortin-4 ligands, uncoupling proteins (UCP1-3), carboxypeptidase E, NPY, their
related transcripts or homologues or their receptors (Coleman et al., 1990; Miller et al.,
1993; Good et al., 1997; Klebig et al., 1995; Naggert et al., 1995; Olhnan et al., 1995;
Kleyn et al., 1996; Richard, 1996; Qu et al., 1996; Fan et al., 1997; Huszar et al., 1997;
North et al., 1997; Ohki-Hamazaki et al., 1997; Graham et al., 1997; Boss et al., 1997;
Vidal-Puig et al., 1997; Millet et al., 1997; Cool et al., 1997; Kristensen et al., 1998; De
Lecea et al., 1998; Sakurai et al., 1998). These models exhibit some degree of sexual
dimorphism, a slight delay in onset of obesity or a dominant pattern of inheritance, though
none show all of these in combination. It is generally believed that obesity is due to the
complex interaction of a number of different factors.

The study of obesity and its effects on health requires suitable animal models which can
faithfully replicate the condition as seen in humans. None of the available models
combines all of the symptoms of obesity. In particular, the symptoms of male pattern
obesity, which include late onset, sterility and a concentration of fat around the abdomen,
are not displayed by currently available models. There is therefore a need for an
improved model for obesity, which displays more of the characteristics of obesity
observed in human patients.

Transgenesis is a well established technique for the introduction of DNA sequences into
the mammalian genome, and has been used to insert endocrine genes in several species,
predominantly in mice (Palmiter et al., 1982; Bucchini et al., 1986; McGrane et al., 1988;
Ho et al., 1995), but also in other species (Hammer et al., 1985; Pursel et al., 1989),
including rats (Mullins et al., 1990; Zeng et al., 1994; Chareau et al., 1996; Flavell et al.,
1996). The methods are well described (Hogan et al., 1986; Chareau et al., 1996) and
usually involve the microinjection of cloned DNA fragments into the male pro-nuclei of
eggs isolated from superovulated females. Such eggs are transferred into the oviduct of
pseudopregnant females (obtained by mating with vasectomized males) and carried to
term. DNA extracted from tail clippings obtained from the progeny may be examined for
the presence of specific transgene DNA. Depending on the integrity and stability of the
DNA sequence, the number of integration sites and their location in the host genome,
transgenes may become stably integrated in the host genome and transmitted to
subsequent progeny.

If promoter and enhancer sequences are present in the transgene, the transgene may show
high levels of expression in the host animals and the products may induce an endocrine
phenotype that would be expected from the hormone product. For example,
overexpression of human growth hormone (hGH) using a variety of heterologous
non-specific promoters induces variable degrees of growth stimulation in transgenic animals
(Palmiter et al., 1982, 1983; Morello et al., 1986; Pursel et al., 1989; Shanahan et al.,
1989; Stewart et al., 1992; Short et al., 1992). However, transgene expression levels often
differ between different transgenic lines made with the same insert, and the tissue
specificity may vary, being highly dependent on the size of the DNA insert, the number of
copies of the insert, its integrity and its integration site(s) in host DNA (Lacy et al., 1983;
Al-Shawi et al., 1990; Huber et al., 1994). Unexpected phenotypes may result, as
pathological consequences either of inappropriate amounts of transgene product or of its
production in ectopic sites, and examples of this for hGH transgenes include
glomerulosclerosis or female infertility in mice or rats (Bartke et al., 1988; Brem et al.,
1989; Quaife et al., 1989; Ninomiya et al., 1994). Intentionally directed expression of a
transgene to an ectopic site may also have a significant influence on the nature of the
phenotype produced (Ornitz et al., 1985; Baker et al., 1992). This is well exemplified
using hGH transgenes, since instead of an overgrowth phenotype, hGH can produce an
opposite, dwarf phenotype in transgenic mice or rats when driven by a promoter that
targets it to the central nervous system to induce negative feedback effects on the
endogenous GH system (Hollingshead et al., 1989; Banerjee et al., 1994; Szabo et al.,
1995; Flavell et al., 1996).

Other examples of endocrine transgenes include those targeting the genes for oxytocin
(OT) and vasopressin (AVP) (Russo et al., 1988; Habener et al., 1989; Grant et al.,
1993a,b; Ang et al., 1991, 1994; Murphy & Ho, 1995). These genes are expressed mainly
in magnocellular neurones of the supraoptic (SON) and paraventricular (PVN) nuclei of
the hypothalamus (Vandesande et al., 1975; Young, 1992; Gainer & Wray, 1994). The
expression of these hormonal peptides appears to be mutually exclusive, coexpression in
the same neurone occurring only rarely (Kiyama et al., 1990). A construct consisting of
sequences 0.6 kb 5', 1.8 kb 3' and the entire structural gene of bovine OT directed
expression to the oxytocinergic cells of the SON and the PVN, but also to the lung and
Sertoli cells of the testis in transgenic mice (Ho et al., 1995). The hypothalamic
expression was also physiologically regulated, with an increase in the abundance of the
transgene transcript occurring during dehydration. The Sertoli cells are a site of peripheral
expression of the endogenous OT gene in cattle but not in mice or rats. In these
transgenic mice, the testicular transcripts are translated and processed (Ang et al., 1994),
suggesting that this construct contained regulatory elements capable of recapitulating the
bovine expression pattern of OT in the mouse testis (Ang et al., 1991). Foo et al. (1994)
have identified a testis-specific promoter in the rat AVP gene.

The AVP and OT genes are highly homologous in structure, and are transcribed in
opposite orientations from positions closely linked in the genome within a single locus
(Sausville et al., 1985; Young, 1992). It is therefore possible that elements in the
flanking sequence of the OT gene normally interact with those present in the nearby
homologous AVP gene to regulate their mutually exclusive expression (Young et al.,
1990; Young, 1992). To test this theory, mice were generated bearing 1.25 kb of 5', 0.2 kb
of 3' and the structural gene for bovine AVP fused, in the same orientation as the
endogenous genes, to the bovine OT transgene already described to show hypothalamic
expression (Ho et al., 1995). The resulting mice expressed the bovine OT transgene in the
testis and lung, but lacked hypothalamic expression of this transgene and did not express
the bovine AVP transgene. A further bovine OT transgene, including 3 kb of 5' sequence,
the structural gene and 2.5 kb of downstream sequence, was used in an attempt to
overcome this repression, but no animals were generated, and it was argued that the
region between 0.6 kb and 3 kb 5' of the bovine OT gene conferred a toxic effect in
embryonic development which is usually repressed (Ho et al., 1995). Transgenic rats have
also been generated bearing fragments of the rat AVP gene with reporter genes inserted
into the third exon of the AVP gene. Transgenes containing 1.5 kb and 3 kb of 5', 0.2 kb of
3' and the rat AVP gene with a β-galactosidase reporter gene in the third exon also
conferred expression to the testis. This was attributed to the presence of a cryptic
testicular promoter within the reporter gene (Zeng et al., 1994a). Clearly, however, the use
of small fragments of DNA containing OT or AVP sequences gives rise to unpredictable
patterns of transgene expression.

One theory is that this variation may be overcome if sufficiently large DNA constructs are
used, containing regions of DNA known as locus control regions (LCRs) that can direct
tissue-specific, position-independent, copy-number-dependent, physiologically
appropriate expression of the transgene in the host (Grosveld et al., 1987; Bonifer et al.,
1990; Huber et al., 1994; Fujiwara et al., 1997). Again this may be exemplified with
hGH transgenes. An LCR region for the hGH gene has been defined by Jones
(1995). When a cosmid containing this sequence was used to generate several lines of
transgenic mice, hGH was expressed in the pituitary gland in an appropriately regulated
fashion, and the mice showed no overgrowth phenotype or other pathological
consequences of overproduction of hGH.
Summary of the Invention



A line of transgenic rats has been generated using a cosmid of rat DNA containing the
genes for oxytocin (OT) and vasopressin (AVP), into which reporter genes were inserted,
namely hGH (Roskam et al., 1979) in the AVP gene and bovine OT mostly replacing the
rat OT gene. To attempt to include LCR regions for this gene locus in our transgene
constructs, larger DNA fragments containing both OT and AVP genes and larger amounts
of flanking sequences were used, which were isolated from a rat cosmid library. One line
of such rats, bearing at least 4 copies of this cosmid as a concatamer integrant, exhibits an
unexpected and novel late-onset obesity and infertility dominant phenotype that would not
be predicted from the known DNA sequences present in this cosmid. This phenotype is
clearly distinguishable from other obesity/infertility syndromes so far described.

Analysis of the cosmid sequences used in the transgene constructs reveals the presence of
a previously unknown gene, which is responsible for the observed obesity phenotype.



Accordingly, in the first aspect of the present invention there is provided a 5'OT-EST
polypeptide having a sequence selected from the group comprising the sequences set
forth in any one of SEQ. ID. Nos. 2, 4 or 6, and sequences substantially homologous to any
one of the polypeptides set forth in SEQ. ID. Nos. 2, 4 or 6.

In a second aspect, the invention provides a mutant of a 5'OT-EST polypeptide according to
the first aspect of the invention which is capable, in vivo, of modulating the obesity of an
animal expressing it.

In a third aspect, the present invention provides a nucleic acid encoding a 5'OT-EST
polypeptide or mutant 5'OT-EST polypeptide according to the first aspect of the
invention. Advantageously, the nucleic acid has a sequence selected from the group
consisting of any one of SEQ. ID. Nos. 1, 3, 5 or 7; sequences which are hybridisable under
stringent conditions with an oligonucleotide comprising 20 contiguous bases from any one
of SEQ. ID. Nos. 1, 3, 5 or 7; sequences substantially homologous to any one of SEQ. ID.
Nos. 1, 3, 5 or 7; and sequences complementary thereto. Stringent hybridisation conditions
are preferably as defined below.
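
By way of illustration only, the notion of "an oligonucleotide comprising 20 contiguous bases" from a reference sequence, and of a sequence "complementary thereto", can be sketched as follows. This is a minimal sketch, not part of the disclosed method: the function names are hypothetical, the input is assumed to be a plain DNA string in the alphabet A, C, G, T, and the short example sequence in the comments is invented rather than taken from any SEQ. ID. The sketch enumerates candidate oligonucleotides only; it says nothing about whether a given sequence would hybridise under stringent conditions.

```python
# Hypothetical helpers; names and inputs are illustrative assumptions only.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A/C/G/T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def contiguous_windows(seq: str, size: int = 20) -> list[str]:
    """Return every run of `size` contiguous bases found in `seq`.

    A sequence shorter than `size` yields an empty list.
    """
    return [seq[i:i + size] for i in range(len(seq) - size + 1)]


# Example with an invented 25-base sequence: it contains six 20-base windows.
# windows = contiguous_windows("ACGTACGTACGTACGTACGTACGTA")
# probes = [reverse_complement(w) for w in windows]
```

Each window, or its reverse complement, is a candidate oligonucleotide in the sense used above; selecting among them would depend on the hybridisation conditions defined elsewhere in the specification.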

In a fourth aspect, the invention provides diagnostic reagents for the detection of mutations,
polymorphisms or other changes in 5'OT-EST which may predispose an individual to
obesity. For example, the invention provides probes useful for amplifying 5'OT-EST
nucleic acids.

In a fifth aspect, the invention provides a transgenic non-human animal expressing, as a
result of transgene expression, a 5'OT-EST polypeptide or mutant 5'OT-EST polypeptide
according to the invention. Transgenic animals according to this aspect of the invention
are models for obesity in humans, and may be used for research into therapies and
treatments which may be used to alleviate obesity.
Brief Description of the Drawings

Figure 1 shows a partial restriction map of the rat AVP/OT locus, from cosmids cV01,
cV02 and cV03. Restriction sites for the enzymes listed are shown as vertical marks.
The known sequence of the rat AVP and OT genes is indicated by single dashed lines,
and the sequence determined and disclosed herein of the rat 5'OT-EST gene is indicated
by the double dashed line. Scale is approximate.

Figure 2 is a diagrammatic representation of the subcloning steps leading to the
insertion of the hGH reporter gene into the 5' untranslated region of the rat vasopressin
gene. The final subclone shows the Cla I to Xho I fragment which was inserted into
the construct used to make transgenic lines.

Figure 3 is a diagrammatic representation of the subcloning steps required for the
production of the rat-bovine hybrid gene which was inserted into the final construct.
For simplicity, this is shown in reversed orientation compared to its orientation in the
construct.

Figure 4 shows the extent of the rat AVP/OT locus present in the cosmids cV01, 2 and 3.
These clones span a total of 44 kb, including 8 kb 5' of rAVP and 24 kb 5' of rOT. The
structure of the final cosmid construct cV014 is illustrated and some restriction sites
indicated.

Figure 5 shows a point mutation in the cV014 construct in a conserved region 5' to the
OT gene. A conserved G residue is substituted with an A residue in the construct.

Figure 6 is an alignment of the sequences of 5'OT-EST from rat, human and mouse 
sources. 






Figure 7 is a comparison of the body weights of transgenic and non-transgenic rats. 
Figure 8 is a comparison of the body weights of transgenic and non-transgenic male and
female rats.

Figure 9 is a comparison of the measurements (in mm) of the pelvis (b) and of the body
length (a) of 20 and 52 week male transgenic and non-transgenic rats (mean +/- SEM,
***=p<0.001, n=6-7 per group).

Figure 10 illustrates the increased body weight/body length ratio of transgenic rats 
compared with non-transgenic rats. 


Figure 11 shows the weights of the peri-renal and testicular fat pads in JP17 male 
transgenic and non-transgenic animals at different ages. (*=p<0.05; **=p<0.01, n=6-7 per 
group). 

Figure 12 shows the levels of plasma insulin, glucose, cholesterol, triglycerides, leptin and
corticosterone in terminal blood samples from transgenic and non-transgenic rats. Values
shown are mean of each group +/- SEM (n=6 for transgenic groups; n=4 for
non-transgenic groups) (* Significantly different (p<0.05) from sex-matched non-transgenic
group).


Figure 13 shows the changes in body weight, leptin levels and food intake associated with
young (100 day old) rats fed on normal fat (4%) or high fat (30%) diets. Results are
shown for SLOB, non-transgenic and dwarf rats, fed either a 4% fat diet (clear bars) or a
30% fat diet (stippled bars) over a 27-day period (* = p<0.05; ** = p<0.01; *** =
p<0.001, high vs. low fat diet; ## = p<0.01; ### = p<0.001, SLOB vs. non-transgenic
rats).

Figure 14 shows the changes in body weight associated with ovariectomy in transgenic
rats and non-transgenic littermates. ■ - ovariectomised SLOB rat; □ - ovariectomised
wild-type rat; • - sham ovariectomised SLOB rat; o - sham ovariectomised wild-type rat.
Detailed Description of the Invention
Definitions 

As referred to herein, "5'OT-EST" is the polypeptide represented in SEQ. ID. Nos. 2, 4 or
6 (rat, human and mouse respectively). Preferably, it is the human sequence. However, the 
term also covers alternative peptides homologous to 5'OT-EST, such as polypeptides 
derived from other species, including other mammalian species. 

"Mutants" of 5'OT-EST include polypeptides which differ only in minor, insignificant 
ways from wild-type 5'OT-EST, for example polypeptides having conservative amino 
acid replacements or additions or deletions. Preferred, however, are mutants which are 
able to confer, on animals expressing them, an obese phenotype as defined herein. An 
example of such a mutant is the 5'OT-EST - xdel polypeptide set forth in SEQ. ID. No. 8. 
Further mutants may be obtained as described herein, and defined according to their 
functional effects in transgenic animals or host cells. 

"Substantially homologous", whether applied to polypeptide or nucleotide sequences, is 
as defined herein with reference to homology screening. It may be interpreted as referring 
either to sequence alignment and direct comparison, or to homology as defined by BLAST
homology searching as defined herein. 
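
As an illustration of the "sequence alignment and direct comparison" reading, the proportion of identical positions between two already-aligned sequences can be computed as sketched below. This is a minimal sketch under stated assumptions: the function name is hypothetical, the two inputs are assumed to have been aligned beforehand (equal length, with "-" marking gaps), columns containing a gap are simply excluded from the comparison, and no threshold for "substantially homologous" is implied, since that is defined elsewhere in the specification.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of compared (non-gap) alignment columns that are identical.

    Both inputs must already be aligned to the same length. Columns where
    either sequence has a gap are skipped (an assumption of this sketch).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # ignore gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0  # nothing to compare
    return 100.0 * matches / compared


# Example with invented sequences: three of four columns agree.
# percent_identity("ACGT", "ACGA")
```

Other conventions exist (for example, counting gapped columns as mismatches), and a BLAST-based comparison would score the sequences differently; the sketch shows only the simplest direct comparison.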

A "transgenic animal" is an animal whose genome has been functionally altered by
genetic manipulation. In the context of the present invention, this includes animals
bearing and expressing a 5'OT-EST or mutant 5'OT-EST transgene, animals from which
5'OT-EST sequences have been deleted or in which they have been modified, and animals
which are transiently transformed to express a (mutant) 5'OT-EST transgene such as by
transformation with viral sequences.

"Transformation" refers to the functional insertion of a gene by nucleic acid transfer, or 
the functional deletion of a gene, in a cell or organism. The term thus includes 
transfection, transduction and any other techniques useful for transferring nucleic acids 
into cells or organisms. Cells transformed according to the invention express a novel
genotype as a result of the transformation. 

For the avoidance of doubt, unless otherwise required by the specific context, reference
herein to an entity in the singular includes the plural thereof. Thus, the expressions "a
gene" and "one or more genes" are equivalent.

Moreover, unless otherwise required by context, references to 5'OT-EST (5'OT-EST)
preferably include mutants of 5'OT-EST (5'OT-EST).

A "cosmid" is a bacteriophage-based vector as commonly known in the art.

References herein to "obesity" and obese animals are preferably references to the SLOB
phenotype observed in SLOB rats according to the invention, characterised in being inter
alia male-specific, late onset, with fat deposition concentrated in the abdominal area and
associated with sterility.

Description of Preferred Embodiments 

A cosmid (cV014) of rat DNA containing the rat vasopressin (AVP) and rat oxytocin
(OT) genes (Ivell & Richter, 1984) was constructed, and DNA reporter sequences inserted
therein using standard methods (Sambrook et al., 1989) as outlined in Examples 1 & 2
below. Microinjection of the cV014 DNA insert into fertilised rat eggs and their transfer
into pseudopregnant recipients resulted in production of viable offspring. Unexpectedly,
the male founder rat with 4-5 copies of cV014 (JP17) showed a dominant phenotype of
severe late-onset visceral obesity. This form of obesity shows (i) a very late onset, (ii) a
highly selective visceral distribution of fat developing on a normal rodent diet, without
hyperphagia, (iii) an effect greatly preponderant in males, (iv) a predisposition to
excessive dietary-fat induced obesity at an early age, before the phenotype becomes
apparent on a normal diet, and (v) a dominant pattern of inheritance. Moreover, male
transgenics show severe infertility, whilst females are fertile. Rats bearing this
transgene have been termed SLOB rats (for Severe Late-onset OBesity). The symptoms
of obesity observed in SLOB rats all occur in several forms of human obesity, including
that associated with human syndrome-X (Reaven et al., 1988), for which a late-onset
increase in abdominally distributed fat, affecting males much more severely than females
(Gray et al., 1997), may be mimicked in the SLOB rat. Obvious causes, such as leptin
deficiency or insulin resistance or overt Type 1 or 2 diabetes, may be excluded.

Although the SLOB phenotype is preponderant in males, it may be markedly exacerbated 
in females by ovariectomy. 

Mapping and analysis of cosmid DNA used to generate cV014 revealed a putative gene,
5' of the OT locus. A fragment of this DNA was subcloned and sequenced. Analysis of
this region of rat DNA enabled us to determine the location, orientation, partial exon
structure and predicted protein product of a novel gene lying 5' of the OT gene in rat
DNA. Further sequencing and analysis elucidated the structure of this gene, and provided
additional sequence information for the cosmid DNA surrounding the known sequence of
the OT and AVP genes. The novel rat gene is termed herein 5'OT-EST, which encodes
the 5'OT-EST polypeptide. The genomic sequence of 5'OT-EST is given in SEQ. ID. No.
16.

A search of DNA and protein databases revealed no significant match to any known gene,
but recognised partial matches to DNA sequences homologous to 5'OT-EST in expressed
sequence tag (EST) databases from rat, mouse and human DNA sources. These represent
partial products of the rat gene, and of genes homologous to this novel rat gene, in mouse
and human DNA. The predicted structures of four exons, termed w, x, y, z, and predicted
protein sequences are highly conserved between these species. A partial match was noted
to a human genomic DNA sequence alluded to, but not disclosed, in White et al. PNAS
95:305-309 (1998), but deposited by them in Genbank (Accession no: AF036329) as a
putative genomic fragment containing the human GnRH-II gene. The relationship
between human 5'OT-EST and human GnRH-II as described by White et al. is confirmed
by the present work; however, there does not appear to be any such relationship in rats or
mice. Homologous rat GnRH-II sequences cannot be recognised by sequence analysis in
cV014, which contains more than 10kb of rat DNA flanking 5'OT-EST. Neither can any
homologous mouse GnRH-II sequence be identified by hybridisation or PCR studies in
multiple mouse genomic clones which contain 5'OT-EST and at least 50kb of flanking
DNA. Thus, it appears highly unlikely that the GnRH-II sequence corresponds
functionally with 5'OT-EST.

In rats, 5'OT-EST lies about 10kb downstream of the 3' exon of the protein tyrosine
phosphatase receptor alpha (Ptpra) gene, and the intervening 10kb shows no homology
with GnRH-II. Additionally, mouse BAC clones containing 5'OT-EST show no
homology to GnRH-II. Thus, GnRH-II sequences are not adjacent to 5'OT-EST in rats or
mice, and neither GnRH-II nor Ptpra is present in the cosmid used to generate SLOB
rats. Complete sequencing of the cosmid reveals no other novel genes.

Based on physical linkage to Ptpra, Avp and Oxt, 5'OT-EST maps to the distal region of
mouse chromosome 2, 7.32 cM from the centromere. Ptpra has itself been implicated in
the control of insulin sensitivity, and both 5'OT-EST and Ptpra lie within 0.21 cM of mg,
another gene implicated in the suppression of obesity. In mouse, all three genes map to
the same region as the mouse obesity locus Mob5 (Encyclopaedia of the Mouse genome
VII: Mouse Chromosome 2; (1998) Peters et al., Mamm. genome 8 Spec No:S27-49). It
is likely therefore that 5'OT-EST contributes to the trait observed at this locus in mice.

Accordingly, the present invention provides 5'OT-EST polypeptide. 5'OT-EST
according to the present invention may be mouse, rat or human 5'OT-EST, as well as
variants of 5'OT-EST derivable from other species or by natural or artificial mutation of a
5'OT-EST gene.

The variant provided by the present invention includes splice variants encoded by
mRNA generated by alternative splicing of a primary transcript, amino acid mutants,
glycosylation variants and other covalent derivatives of 5'OT-EST which retain the
physiological and/or physical properties thereof. Exemplary derivatives include
molecules wherein 5'OT-EST is covalently modified by substitution, chemical,
enzymatic, or other appropriate means with a moiety other than a naturally occurring
amino acid. Such a moiety may be a detectable moiety such as an enzyme or a
radioisotope. Further included are naturally occurring variants of 5'OT-EST found
within a particular species, preferably a mammal. Such a variant may be encoded by a
related gene of the same gene family, by an allelic variant of a particular gene, or
represent an alternative splicing variant of 5'OT-EST.

Variants which retain common structural features can be fragments of 5'OT-EST. 
Fragments of 5'OT-EST comprise smaller polypeptides derived therefrom.
Preferably, smaller polypeptides derived from 5'OT-EST according to the invention 
define a single feature which is characteristic of 5'OT-EST as described in the present 
application. 

Derivatives of 5'OT-EST also comprise mutants thereof, which may contain amino acid 
deletions, additions or substitutions. Thus, conservative amino acid substitutions may 
be made substantially without altering the nature of 5'OT-EST. Deletions and 
substitutions may moreover be made to the fragments of 5'OT-EST comprised by the 
invention. 

Mutants of 5'OT-EST according to the present invention may possess properties 
different from those of naturally occurring 5'OT-EST. In particular, 5'OT-EST 
mutants may modulate the expression of native 5'OT-EST. 

5'OT-EST mutants may be produced from a nucleic acid encoding 5'OT-EST which 
has been subjected to in vitro mutagenesis resulting e.g. in an addition, exchange 
and/or deletion of one or more amino acids. For example, substitutional, deletional or 
insertional variants of 5'OT-EST can be prepared by recombinant methods and screened 
for immuno-crossreactivity with the native forms of 5'OT-EST. 

Preferably, 5'OT-EST according to the present invention has the sequence of SEQ. ID.
No. 2 (rat), SEQ. ID. No. 4 (human) or SEQ. ID. No. 6 (mouse). Mutants possessing
desired properties may be generated from these sequences, or isolated from natural
sources, by a variety of techniques which assess the biological function of the 5'OT-EST
mutant. For example, nucleic acids encoding 5'OT-EST mutants may be used to
generate transgenic animals and these animals assessed for indications of an obesity
phenotype.



For example, the effects of mutant transgenes may be assessed by carcass analysis,
measurement of growth, body weight, body fat distribution, as well as other measures of
analytes in body fluids or tissues relevant to obesity in transgenic animals (Mathe, 1995;
Shillabeer, 1992). These include, but are not limited to, cholesterol, triglycerides, fatty
acids, lipoproteins, and other dietary constituents or metabolites, as well as metabolic
hormones, such as leptin, insulin, glucagon, catecholamines or glucocorticoids. Other
relevant parameters include cardiovascular measures (Reaven, 1988; Gray & Yudkin,
1997). These may include measures of systolic or diastolic blood pressure, cardiac
output, or vascular resistance, together with morphological changes to organ systems
known to be affected by cardiovascular or obesity disorders, such as heart, major or minor
blood vessels, their muscle or endothelial layers, and their elasticity or fragility. See for
example McNamee et al. (1994).

Similarly, parameters related to the infertility phenotype that may be measured include,
but are not limited to, testicular weight, volume, development, spermatogenesis, sperm
number, motility or ability to fertilise oocytes. They may also include measures of
testicular fluid production and constituents, as well as products of other accessory organs
including seminal vesicles or prostate, as well as hormones, receptors, and proteins
important in male sexual function, such as testosterone, LH, FSH, inhibin or activin.
Other responses that may be affected include energy expenditure, physical activity,
ingestive behaviour, excretory behaviour, or reproductive behaviour, or the organs,
hormones or receptors commonly recognised to be associated with these physiological
systems, their metabolism or morphological structure.

5'OT-EST as disclosed herein is a polypeptide encoded by four exons, termed w, x, y
and z (see SEQ. ID. No. 16). Advantageously, mutants of 5'OT-EST are mutated in, or
preferably lack all or part of, the sequences encoded by one or more exons of 5'OT-EST.
Preferably, mutants of 5'OT-EST lack, or are mutated in, all or part of the sequences
encoded by exons x, y and z of 5'OT-EST.

Preferably, the sequences encoded by exons x, y and z are deleted and those encoded by
exon w partially deleted. Most preferably, the mutant is 5'OT-EST - xdel as described
herein, for example in SEQ. ID. No. 8.

The fragments, mutants and other derivatives of 5'OT-EST preferably retain substantial
homology with 5'OT-EST. As used herein, "homology" means that the two entities
share sufficient characteristics for the skilled person to determine that they are similar
in origin and function. Preferably, homology is used to refer to sequence identity.
Thus, the derivatives of 5'OT-EST preferably retain substantial sequence identity with
5'OT-EST.

"Substantial homology", where homology indicates sequence identity, means more than
40% sequence identity, preferably more than 45% sequence identity and most
preferably a sequence identity of 50% or more, as judged by direct best-fit sequence
alignment and comparison.
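By way of illustration only, the identity thresholds recited above may be applied to a pair of already-aligned sequences as sketched below. The function names, the gap character "-" and the choice of denominator (all aligned columns other than gap-against-gap columns) are assumptions made for this sketch; the specification does not prescribe a particular way of counting.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pairwise alignment.

    Assumption: both strings come from the same alignment, with '-'
    marking gaps. Columns where both sequences carry a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def homology_tier(identity: float) -> str:
    """Map a percent identity onto the tiers recited in the text."""
    if identity >= 50.0:
        return "most preferred"   # 50% or more
    if identity > 45.0:
        return "preferred"        # more than 45%
    if identity > 40.0:
        return "substantial"      # more than 40%
    return "below threshold"


if __name__ == "__main__":
    pid = percent_identity("ACGT", "ACGA")
    print(pid, homology_tier(pid))
```

A different denominator (for example, the length of the shorter sequence) would give different percentages for gapped alignments, so the convention should be stated alongside any figure reported.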

Sequence homology (or identity) may moreover be determined using any suitable
homology algorithm, using for example default parameters. Advantageously, the
BLAST algorithm is employed, with parameters set to default values. The BLAST
algorithm is described in detail at http://www.ncbi.nih.gov/BLAST/blast_help.html,
which is incorporated herein by reference. The search parameters are defined as
follows, and are advantageously set to the defined default parameters.

Advantageously, "substantial homology" when assessed by BLAST equates to 
sequences which match with an EXPECT value of at least about 7, preferably at least 
about 9 and most preferably 10 or more. The default threshold for EXPECT in BLAST 
searching is usually 10. 


BLAST (Basic Local Alignment Search Tool) is the heuristic search algorithm
employed by the programs blastp, blastn, blastx, tblastn, and tblastx; these programs
ascribe significance to their findings using the statistical methods of Karlin and Altschul
(see http://www.ncbi.nih.gov/BLAST/blast_help.html) with a few enhancements. The
BLAST programs were tailored for sequence similarity searching, for example to
identify homologues to a query sequence. The programs are not generally useful for
motif-style searching. For a discussion of basic issues in similarity searching of
sequence databases, see Altschul et al. (1994).

The five BLAST programs available at http://www.ncbi.nlm.nih.gov perform the
following tasks:

blastp compares an amino acid query sequence against a protein sequence database;

blastn compares a nucleotide query sequence against a nucleotide sequence database;

blastx compares the six-frame conceptual translation products of a nucleotide query
sequence (both strands) against a protein sequence database;

tblastn compares a protein query sequence against a nucleotide sequence database
dynamically translated in all six reading frames (both strands);

tblastx compares the six-frame translations of a nucleotide query sequence against the
six-frame translations of a nucleotide sequence database.
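The division of labour between the five programs listed above can be summarised as a small lookup, sketched below. The function name choose_program and its argument names are hypothetical and serve only to restate the list; they are not part of any BLAST distribution.

```python
# Query type and database type, as described in the list above.
_PROGRAMS = {
    ("protein", "protein"): "blastp",
    ("nucleotide", "nucleotide"): "blastn",
    ("nucleotide", "protein"): "blastx",    # query translated in six frames
    ("protein", "nucleotide"): "tblastn",   # database translated in six frames
}


def choose_program(query_type: str, db_type: str,
                   compare_translations: bool = False) -> str:
    """Return the program name matching the query and database types.

    compare_translations selects tblastx, which translates both a
    nucleotide query and a nucleotide database in six frames.
    """
    if compare_translations:
        if (query_type, db_type) != ("nucleotide", "nucleotide"):
            raise ValueError("tblastx needs a nucleotide query and database")
        return "tblastx"
    try:
        return _PROGRAMS[(query_type, db_type)]
    except KeyError:
        raise ValueError(f"unsupported combination: {query_type}/{db_type}")


if __name__ == "__main__":
    print(choose_program("protein", "nucleotide"))
```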

BLAST uses the following search parameters: 


HISTOGRAM Display a histogram of scores for each search; default is yes. (See
parameter H in the BLAST Manual).

DESCRIPTIONS Restricts the number of short descriptions of matching sequences
reported to the number specified; default limit is 100 descriptions. (See parameter V in
the manual page). See also EXPECT and CUTOFF.

ALIGNMENTS Restricts database sequences to the number specified for which high-
scoring segment pairs (HSPs) are reported; the default limit is 50. If more database
sequences than this happen to satisfy the statistical significance threshold for reporting
(see EXPECT and CUTOFF below), only the matches ascribed the greatest statistical
significance are reported. (See parameter B in the BLAST Manual).

EXPECT The statistical significance threshold for reporting matches against database
sequences; the default value is 10, such that 10 matches are expected to be found
merely by chance, according to the stochastic model of Karlin and Altschul (1990). If
the statistical significance ascribed to a match is greater than the EXPECT threshold,
the match will not be reported. Lower EXPECT thresholds are more stringent, leading
to fewer chance matches being reported. Fractional values are acceptable. (See
parameter E in the BLAST Manual).

CUTOFF Cutoff score for reporting high-scoring segment pairs. The default value is
calculated from the EXPECT value (see above). HSPs are reported for a database
sequence only if the statistical significance ascribed to them is at least as high as would
be ascribed to a lone HSP having a score equal to the CUTOFF value. Higher
CUTOFF values are more stringent, leading to fewer chance matches being reported.
(See parameter S in the BLAST Manual). Typically, significance thresholds can be
more intuitively managed using EXPECT.
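The relationship between EXPECT and CUTOFF described above can be illustrated with the Karlin and Altschul expectation E = K·m·n·exp(-λ·S), where m and n are the query and database lengths and S is the raw score of a segment pair. The sketch below is an approximation under stated assumptions: the values of LAMBDA and K are placeholders of the order commonly quoted for ungapped protein scoring, and the BLAST programs apply further corrections (for example to the effective lengths) that are not reproduced here.

```python
import math

# Assumed, illustrative statistical parameters; real values depend on the
# scoring matrix and gap penalties in use.
LAMBDA = 0.318
K = 0.134


def expected_hits(score: float, m: int, n: int,
                  lam: float = LAMBDA, k: float = K) -> float:
    """Expected number of chance HSPs scoring at least `score`."""
    return k * m * n * math.exp(-lam * score)


def cutoff_score(expect: float, m: int, n: int,
                 lam: float = LAMBDA, k: float = K) -> float:
    """Raw score at which the expected number of chance HSPs equals `expect`."""
    return math.log(k * m * n / expect) / lam


if __name__ == "__main__":
    m, n = 300, 1_000_000   # hypothetical query and database lengths
    for e in (10.0, 1.0, 0.001):
        print(e, round(cutoff_score(e, m, n), 1))
```

Running the sketch shows the behaviour stated in the text: a lower EXPECT threshold corresponds to a higher CUTOFF score, and both are more stringent.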

MATRIX Specify an alternate scoring matrix for BLASTP, BLASTX, TBLASTN and
TBLASTX. The default matrix is BLOSUM62 (Henikoff & Henikoff, 1992). The valid
alternative choices include: PAM40, PAM120, PAM250 and IDENTITY. No alternate
scoring matrices are available for BLASTN; specifying the MATRIX directive in
BLASTN requests returns an error response.

STRAND Restrict a TBLASTN search to just the top or bottom strand of the database
sequences; or restrict a BLASTN, BLASTX or TBLASTX search to just reading
frames on the top or bottom strand of the query sequence.

FILTER Mask off segments of the query sequence that have low compositional
complexity, as determined by the SEG program of Wootton & Federhen (1993)
Computers and Chemistry 17:149-163, or segments consisting of short-periodicity
internal repeats, as determined by the XNU program of Claverie & States (1993)
Computers and Chemistry 17:191-201, or, for BLASTN, by the DUST program of
Tatusov and Lipman (see http://www.ncbi.nlm.nih.gov). Filtering can eliminate
statistically significant but biologically uninteresting reports from the blast output (e.g.,
hits against common acidic-, basic- or proline-rich regions), leaving the more
biologically interesting regions of the query sequence available for specific matching
against database sequences.

Low complexity sequence found by a filter program is substituted using the letter "N" 
in nucleotide sequence (e.g., "NNNNNNNNNNNNN") and the letter "X" in protein 
sequences (e.g., "XXXXXXXXX"). 
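The substitution step described above may be sketched as follows. The sketch does not implement SEG, XNU or DUST; it assumes that the low-complexity segments have already been located by such a program and are supplied as half-open (start, end) index pairs. The function name mask_segments is hypothetical.

```python
from typing import Iterable, Tuple


def mask_segments(sequence: str,
                  segments: Iterable[Tuple[int, int]],
                  is_protein: bool) -> str:
    """Replace the given segments with 'X' (protein) or 'N' (nucleotide).

    segments: half-open, zero-based (start, end) pairs, assumed to come
    from a separate low-complexity filter.
    """
    mask_char = "X" if is_protein else "N"
    chars = list(sequence)
    for start, end in segments:
        # Clamp to the sequence bounds so stray coordinates do not fail.
        for i in range(max(start, 0), min(end, len(chars))):
            chars[i] = mask_char
    return "".join(chars)


if __name__ == "__main__":
    print(mask_segments("ACGTAAAAAAACGT", [(4, 11)], is_protein=False))
```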

Filtering is only applied to the query sequence (or its translation products), not to
database sequences. Default filtering is DUST for BLASTN, SEG for other programs.

It is not unusual for nothing at all to be masked by SEG, XNU, or both, when applied
to sequences in SWISS-PROT, so filtering should not be expected to always yield an
effect. Furthermore, in some cases, sequences are masked in their entirety, indicating
that the statistical significance of any matches reported against the unfiltered query
sequence should be suspect.

NCBI-gi Causes NCBI gi identifiers to be shown in the output, in addition to the
accession and/or locus name.

Most preferably, sequence comparisons are conducted using the simple BLAST search
algorithm provided at http://www.ncbi.nlm.nih.gov/BLAST.

Conventional BLAST searches of the publicly available databases do not reveal any
homology of the predicted protein product of 5'OT-EST to any known protein.
However, application of a more sophisticated search algorithm, as described in Taylor
et al., 1998, identifies structural similarities to apolipoprotein E (ApoE) in its alpha-
helical domains, but without any apparent LDL-receptor domain. Since ApoE is
centrally involved in lipid metabolism and transport, a role for 5'OT-EST in cellular
lipid handling is suggested.

Accordingly, the invention provides a method for identifying a candidate compound 
capable of influencing lipid transport, comprising the steps of: 

a) contacting 5'OT-EST polypeptide with a candidate compound or compounds and
determining which candidate compound or compounds are capable of interacting with
5'OT-EST;

b) optionally, testing candidate compounds which interact with 5'OT-EST in a
transgenic animal according to the invention.

According to a further aspect of the present invention, there is provided a nucleic acid
encoding 5'OT-EST or a mutant thereof. In addition to being useful for the production of
recombinant 5'OT-EST protein, these nucleic acids are also useful as probes, thus readily
enabling those skilled in the art to identify and/or isolate nucleic acid encoding 5'OT-EST
and/or mutant 5'OT-EST. The nucleic acid may be unlabelled or labelled with a
detectable moiety. Furthermore, nucleic acid according to the invention is useful e.g. in a
method of determining the presence of 5'OT-EST-specific nucleic acid, said method
comprising hybridising the DNA (or RNA) encoding 5'OT-EST (or its complement) to
test sample nucleic acid and determining the presence of 5'OT-EST. In another aspect, the
invention provides a nucleic acid sequence that is complementary to, or hybridises under
stringent conditions to, a nucleic acid sequence encoding 5'OT-EST.

The invention also provides a method for amplifying a nucleic acid test sample
comprising priming a nucleic acid polymerase (chain) reaction with nucleic acid
corresponding to 5'OT-EST, including the untranslated regions (or its complement).

In still another aspect of the invention, the nucleic acid is DNA and further comprises a
replicable vector comprising the nucleic acid encoding 5'OT-EST operably linked to
control sequences recognised by a host transformed by the vector. Furthermore the
invention provides host cells transformed with such a vector and a method of using a
nucleic acid encoding 5'OT-EST to effect the production of 5'OT-EST, comprising
expressing 5'OT-EST nucleic acid in a culture of the transformed host cells and, if
desired, recovering 5'OT-EST from the host cell culture.

Isolated 5'OT-EST nucleic acid includes nucleic acid that is free from at least one
contaminant nucleic acid with which it is ordinarily associated in the natural source of
5'OT-EST nucleic acid or in crude nucleic acid preparations, such as DNA libraries and
the like. Isolated nucleic acid thus is present in other than the form or setting in which it
is found in nature. However, isolated 5'OT-EST encoding nucleic acid includes 5'OT-EST
nucleic acid in ordinarily 5'OT-EST-expressing cells where the nucleic acid is in a
chromosomal location different from that of natural cells or is otherwise flanked by a
different DNA sequence than that found in nature.

In accordance with the present invention, there are provided isolated nucleic acids, e.g.
DNAs or RNAs, encoding 5'OT-EST, particularly mammalian 5'OT-EST, e.g. human
5'OT-EST, or fragments thereof. In particular, the invention provides a DNA molecule
encoding 5'OT-EST, or a fragment thereof. By definition, such a DNA comprises a
coding single stranded DNA, a double stranded DNA of said coding DNA and
complementary DNA thereto, or this complementary (single stranded) DNA itself. An
exemplary nucleic acid encoding 5'OT-EST is represented in SEQ ID Nos. 1, 3 and/or 5.

The preferred sequence encoding 5'OT-EST is that having substantially the same
nucleotide sequence as the coding sequences in SEQ ID Nos. 1, 3 and/or 5, with the
nucleic acid having the same sequence as the coding sequence in SEQ ID Nos. 1, 3 and/or
5 being most preferred. As used herein, nucleotide sequences which are substantially the
same share at least about 90% identity. However, in the case of splice variants having e.g.
an additional exon, sequence homology may be lower. Homology is determined as
described above.

The invention moreover provides nucleic acids encoding 5'OT-EST, comprising the gene
5'OT-EST or variants thereof as defined herein. The nucleic acids of the invention,
whether used as probes or otherwise, are preferably substantially homologous to the
sequence of 5'OT-EST as shown in SEQ ID Nos. 1, 3 and/or 5. The terms "substantially"
and "homologous" are used as hereinbefore defined with reference to the 5'OT-EST
polypeptide.

Preferably, nucleic acids according to the invention are fragments of the 5'OT-EST
sequence, or derivatives thereof as hereinbefore defined in relation to polypeptides.
Fragments of the nucleic acid sequence of a few nucleotides in length, preferably 5 to 150
nucleotides in length, are especially useful as probes.

Exemplary nucleic acids can alternatively be characterised as those nucleotide sequences
which encode a 5'OT-EST protein, or which correspond to untranslated regions of
5'OT-EST, and hybridise to the DNA sequences set forth in SEQ ID Nos. 2, 4 and/or 6, or a
selected fragment of said DNA sequence. Preferred are such sequences encoding
5'OT-EST which hybridise under high-stringency conditions to the sequence of SEQ ID Nos. 1,
3 and/or 5.

Stringency of hybridisation refers to conditions under which polynucleic acid hybrids are
stable. Such conditions are evident to those of ordinary skill in the field. As known to
those of skill in the art, the stability of hybrids is reflected in the melting temperature
(Tm) of the hybrid, which decreases approximately 1 to 1.5 °C with every 1% decrease in
sequence homology. In general, the stability of a hybrid is a function of sodium ion
concentration and temperature. Typically, the hybridisation reaction is performed under
conditions of higher stringency, followed by washes of varying stringency.

As used herein, high stringency refers to conditions that permit hybridisation of only those
nucleic acid sequences that form stable hybrids in 1 M Na+ at 65-68 °C. High stringency
conditions can be provided, for example, by hybridisation in an aqueous solution
containing 6x SSC, 5x Denhardt's, 1 % SDS (sodium dodecyl sulphate), 0.1 Na+
pyrophosphate and 0.1 mg/ml denatured salmon sperm DNA as non specific competitor.
Following hybridisation, high stringency washing may be done in several steps, with a
final wash (about 30 min) at the hybridisation temperature in 0.2 - 0.1x SSC, 0.1 % SDS.

Moderate stringency refers to conditions equivalent to hybridisation in the above
described solution but at about 60-62 °C. In that case the final wash is performed at the
hybridisation temperature in 1x SSC, 0.1 % SDS.

Low stringency refers to conditions equivalent to hybridisation in the above described
solution at about 50-52 °C. In that case, the final wash is performed at the hybridisation
temperature in 2x SSC, 0.1 % SDS.
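For convenience, the three stringency levels defined above, together with the stated rule that Tm falls by approximately 1 to 1.5 °C for every 1% decrease in sequence homology, can be collected in a short sketch. The dictionary layout and function names are assumptions made for illustration; the numerical values are those recited in the text.

```python
# Hybridisation temperature range (degrees C) and final-wash SSC strength
# (as a multiple of 1x SSC) for each level, as recited above.
STRINGENCY = {
    "high": {"hyb_temp_c": (65, 68), "final_wash_ssc": (0.1, 0.2)},
    "moderate": {"hyb_temp_c": (60, 62), "final_wash_ssc": (1.0, 1.0)},
    "low": {"hyb_temp_c": (50, 52), "final_wash_ssc": (2.0, 2.0)},
}


def tm_drop_range(percent_identity: float) -> tuple:
    """Approximate Tm decrease (min, max) in degrees C for a hybrid of the
    given percent identity, using 1 to 1.5 degrees C per 1% mismatch."""
    if not 0.0 <= percent_identity <= 100.0:
        raise ValueError("percent identity must lie between 0 and 100")
    mismatch = 100.0 - percent_identity
    return (1.0 * mismatch, 1.5 * mismatch)


if __name__ == "__main__":
    low, high = tm_drop_range(90.0)
    print(f"90% identical hybrid: Tm lower by about {low:.0f} to {high:.0f} degrees C")
    print(STRINGENCY["high"])
```

On this rule of thumb, a hybrid sharing 90% identity would be expected to melt roughly 10 to 15 °C below a perfectly matched hybrid, which is why lower hybridisation and wash temperatures admit more divergent sequences.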

It is understood that these conditions may be adapted and duplicated using a variety of
buffers, e.g. formamide-based buffers, and temperatures. Denhardt's solution and SSC are
well known to those of skill in the art as are other suitable hybridisation buffers (see, e.g.
Sambrook, et al., eds. (1989) Molecular Cloning: A Laboratory Manual, Cold Spring
Harbor Laboratory Press, New York or Ausubel, et al., eds. (1990) Current Protocols in
Molecular Biology, John Wiley & Sons, Inc.). Optimal hybridisation conditions have to
be determined empirically, as the length and the GC content of the probe also play a role.

Advantageously, the invention moreover provides nucleic acid sequences which are
capable of hybridising, under stringent conditions, to a fragment of SEQ. ID. Nos. 1, 3, 5
or 7. Preferably, the fragment is between 15 and 50 bases in length. Advantageously, it is
about 25 bases in length, preferably about 20 bases in length. For differentiating between
mutant and wild type 5'OT-EST by PCR reactions, 20mers are the preferred size, whilst
for use as probes in, for example, Southern hybridisation, the use of 40mers is preferred.
Riboprobes may be designed to be substantially any length, up to and including the entire
length of the largest specific cDNA sequence.

Specifically included, moreover, are sequences complementary to the foregoing
sequences.

Given the guidance provided herein, the nucleic acids of the invention are obtainable
according to methods well known in the art. For example, a DNA of the invention is
obtainable by chemical synthesis, using polymerase chain reaction (PCR) or by screening
a genomic library or a suitable cDNA library prepared from a source believed to possess
5'OT-EST and to express it at a detectable level.

Chemical methods for synthesis of a nucleic acid of interest are known in the art and
include triester, phosphite, phosphoramidite and H-phosphonate methods, PCR and other
autoprimer methods as well as oligonucleotide synthesis on solid supports. These methods
may be used if the entire nucleic acid sequence of the nucleic acid is known, or the
sequence of the nucleic acid complementary to the coding strand is available.
Alternatively, if the target amino acid sequence is known, one may infer potential nucleic
acid sequences using known and preferred coding residues for each amino acid residue.
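When nucleic acid sequences are inferred from an amino acid sequence, the number of candidate coding sequences grows with the codon redundancy of each residue. The sketch below counts that number under the standard genetic code and locates the least redundant stretch of a peptide, which is one common way of choosing where to design an oligonucleotide. The function names and the example peptide are hypothetical; codon usage preferences of a given species are not modelled.

```python
# Number of codons per amino acid in the standard genetic code
# (61 sense codons in total).
CODON_COUNT = {
    "L": 6, "S": 6, "R": 6,
    "A": 4, "G": 4, "P": 4, "T": 4, "V": 4,
    "I": 3,
    "C": 2, "D": 2, "E": 2, "F": 2, "H": 2,
    "K": 2, "N": 2, "Q": 2, "Y": 2,
    "M": 1, "W": 1,
}


def degeneracy(peptide: str) -> int:
    """Number of distinct nucleotide sequences encoding the peptide."""
    total = 1
    for residue in peptide.upper():
        total *= CODON_COUNT[residue]
    return total


def least_degenerate_window(peptide: str, width: int):
    """Return (start, sub-peptide, degeneracy) of the first window of the
    given width with the fewest possible coding sequences."""
    if width < 1 or width > len(peptide):
        raise ValueError("width must be between 1 and the peptide length")
    windows = [(degeneracy(peptide[i:i + width]), i)
               for i in range(len(peptide) - width + 1)]
    best_count, start = min(windows)
    return start, peptide[start:start + width], best_count


if __name__ == "__main__":
    print(degeneracy("LSR"))
    print(least_degenerate_window("LSRMWKL", 2))
```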

An alternative means to isolate the gene encoding 5'OT-EST is to use PCR technology as
described e.g. in section 14 of Sambrook et al., 1989. This method requires the use of
oligonucleotide probes that will hybridise to 5'OT-EST nucleic acid. Strategies for
selection of oligonucleotides are described below.

Libraries are screened with probes or analytical tools designed to identify the gene of
interest or the protein encoded by it. For cDNA expression libraries suitable means
include monoclonal or polyclonal antibodies that recognise and specifically bind to
5'OT-EST; oligonucleotides of about 20 to 80 bases in length that encode known or suspected
5'OT-EST cDNA from the same or different species; and/or complementary or
homologous cDNAs or fragments thereof that encode the same or a hybridising gene.
Appropriate probes for screening genomic DNA libraries include, but are not limited to,
oligonucleotides, cDNAs or fragments thereof that encode the same or hybridising DNA;
and/or homologous genomic DNAs or fragments thereof.

A nucleic acid encoding 5'OT-EST may be isolated by screening suitable cDNA or
genomic libraries under suitable hybridisation conditions with a probe, i.e. a nucleic acid
disclosed herein including oligonucleotides derivable from the sequences set forth in SEQ
ID Nos. 1, 3 and/or 5. Suitable libraries are commercially available or can be prepared e.g.
from cell lines, tissue samples, and the like.

As used herein, a probe is e.g. a single-stranded DNA or RNA that has a sequence of
nucleotides that includes between 10 and 50, preferably between 15 and 30 and most
preferably at least about 20 contiguous bases that are the same as (or the complement of)
an equivalent or greater number of contiguous bases set forth in SEQ ID Nos. 1, 3 and/or
5. The nucleic acid sequences selected as probes should be of sufficient length and
sufficiently unambiguous so that false positive results are minimised. The nucleotide
sequences are usually based on conserved or highly homologous nucleotide sequences or
regions of 5'OT-EST. The nucleic acids used as probes may be degenerate at one or more
positions. The use of degenerate oligonucleotides may be of particular importance where a
library is screened from a species in which preferential codon usage in that species is not
known.
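The contiguous-base criterion in the probe definition above can be checked mechanically, as sketched below. The function names and the default of 20 bases are assumptions taken from the "most preferably at least about 20" wording; the search is a simple exhaustive one intended for short probes, not an optimised implementation, and the target used in the usage lines is a made-up sequence.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def longest_shared_run(probe: str, target: str) -> int:
    """Length of the longest stretch of the probe found verbatim in target."""
    probe, target = probe.upper(), target.upper()
    for length in range(len(probe), 0, -1):
        for start in range(len(probe) - length + 1):
            if probe[start:start + length] in target:
                return length
    return 0


def probe_matches(probe: str, target: str, min_run: int = 20) -> bool:
    """True if the probe shares at least min_run contiguous bases with the
    target sequence or with its complement strand."""
    best = max(longest_shared_run(probe, target),
               longest_shared_run(probe, reverse_complement(target)))
    return best >= min_run


if __name__ == "__main__":
    target = "TTTT" + "GGACAGCCCGAAGGACTACAGGT" + "AAAA"  # hypothetical target
    print(probe_matches("GGACAGCCCGAAGGACTACAGGT", target))
```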

Preferred regions from which to construct probes include 5' and/or 3' coding sequences,
sequences predicted to encode ligand binding sites, and the like. For example, either the
full-length cDNA clone disclosed herein or fragments thereof can be used as probes.
Preferably, nucleic acid probes of the invention are labelled with suitable label means for
ready detection upon hybridisation. For example, a suitable label means is a radiolabel.
The preferred method of labelling a DNA fragment is by incorporating α-32P dATP with
the Klenow fragment of DNA polymerase in a random priming reaction, as is well known
in the art. Oligonucleotides are usually end-labelled with γ-32P-labelled ATP and
polynucleotide kinase. However, other methods (e.g. non-radioactive) may also be used to
label the fragment or oligonucleotide, including e.g. enzyme labelling, fluorescent
labelling with suitable fluorophores and biotinylation.

Probes for cloning and amplifying 5'OT-EST, especially human 5'OT-EST, may be
deduced from the sequence thereof provided herein. Preferred probes may be selected
from the following:

1U   GGACAGCCCGAAGGACTACAGGT      SEQ. ID. No. 18
1L   CGAAGAACTCCGCAGGGTCC         SEQ. ID. No. 19

2U   AAGACCCGCCACGACCCG           SEQ. ID. No. 20
2L   GAATCAGCACCCTCTCCGCC         SEQ. ID. No. 21

3U   TGCGGAGTTCTTCGTGCTGATGGAG    SEQ. ID. No. 22
3L   GGTGCTCGGCGGCGTCCTTC         SEQ. ID. No. 23

4U   GAGTGGCGGAGAGGGTGCTGA        SEQ. ID. No. 24
4L   GGCCGAGGCTGAGCGGGG           SEQ. ID. No. 25

5U   CTGAAGGACGCCGCCGAGCA         SEQ. ID. No. 26
5L   CTCCAACGCCTGCCGCTGC          SEQ. ID. No. 27

6U   GCAGGAGGAGCGGGAGCAGGA        SEQ. ID. No. 28
6L   TCCAGTGCCCCGCAAGCCG          SEQ. ID. No. 29
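As a convenience for anyone working with the probes listed above, their basic properties (length and G+C content) can be tabulated as sketched below. The sequences are copied from the list; the dictionary name PROBES and the helper gc_fraction are hypothetical, and no melting temperatures are calculated because the specification does not give a formula for them.

```python
# Probe sequences as listed above, keyed by probe name.
PROBES = {
    "1U": "GGACAGCCCGAAGGACTACAGGT",
    "1L": "CGAAGAACTCCGCAGGGTCC",
    "2U": "AAGACCCGCCACGACCCG",
    "2L": "GAATCAGCACCCTCTCCGCC",
    "3U": "TGCGGAGTTCTTCGTGCTGATGGAG",
    "3L": "GGTGCTCGGCGGCGTCCTTC",
    "4U": "GAGTGGCGGAGAGGGTGCTGA",
    "4L": "GGCCGAGGCTGAGCGGGG",
    "5U": "CTGAAGGACGCCGCCGAGCA",
    "5L": "CTCCAACGCCTGCCGCTGC",
    "6U": "GCAGGAGGAGCGGGAGCAGGA",
    "6L": "TCCAGTGCCCCGCAAGCCG",
}


def gc_fraction(seq: str) -> float:
    """Fraction of bases in the sequence that are G or C."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(1 for base in seq if base in "GC") / len(seq)


if __name__ == "__main__":
    for name, seq in PROBES.items():
        print(f"{name}  length={len(seq):2d}  GC={gc_fraction(seq):.2f}")
```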



Probes according to the invention are suitable for use as diagnostic reagents to amplify
5'OT-EST and thereby enable the analysis of the nucleic acid for the presence of
mutations, polymorphisms or other changes which could render an individual susceptible
to obesity.

After screening the library, e.g. with a portion of DNA including substantially the entire
5'OT-EST-encoding sequence or a suitable oligonucleotide based on a portion of said
DNA, positive clones are identified by detecting a hybridisation signal; the identified
clones are characterised by restriction enzyme mapping and/or DNA sequence analysis,
and then examined, e.g. by comparison with the sequences set forth herein, to ascertain
whether they include DNA encoding a complete 5'OT-EST (i.e., if they include
translation initiation and termination codons). If the selected clones are incomplete, they
may be used to rescreen the same or a different library to obtain overlapping clones. If the
library is genomic, then the overlapping clones may include exons and introns. If the
library is a cDNA library, then the overlapping clones will include an open reading frame.
In both instances, complete clones may be identified by comparison with the DNAs and
deduced amino acid sequences provided herein.

In order to detect any abnormality of endogenous 5'OT-EST, genetic screening may be
carried out using the nucleotide sequences of the invention as hybridisation probes or as
PCR primers, using which genomic nucleic acid may be amplified, and subsequently
sequenced. Also, based on the nucleic acid sequences provided herein, antisense-type
therapeutic agents may be designed.

It is envisaged that the nucleic acid of the invention can be readily modified by nucleotide 
substitution, nucleotide deletion, nucleotide insertion or inversion of a nucleotide stretch, 
and any combination thereof. Such mutants can be used e.g. to produce a 5'OT-EST 
mutant that has an amino acid sequence differing from the 5'OT-EST sequences as found 
in nature. Mutagenesis may be predetermined (site-specific) or random. A mutation which 
is not a silent mutation must not place sequences out of reading frame and preferably will 
not create complementary regions that could hybridise to produce secondary mRNA 
structure such as loops or hairpins. 
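
The reading-frame requirement stated above can be illustrated by a simple length test: a net insertion or deletion leaves downstream codons in frame only if the change in length is a multiple of three. The function name below is hypothetical. The test is a necessary condition only, and it does not address the formation of secondary mRNA structure. 

```python
# Minimal sketch: a net insertion or deletion keeps downstream codons in
# frame only if the length change is a multiple of three.
# This checks net length only; it is an illustrative assumption, not a
# complete test of a mutant coding sequence.
def preserves_reading_frame(wild_type_cds, mutant_cds):
    """Return True if the mutant differs in length by a multiple of 3."""
    return (len(mutant_cds) - len(wild_type_cds)) % 3 == 0
```

For example, inserting one whole codon into a coding sequence passes the test, whereas inserting a single nucleotide does not. 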

The invention accordingly specifically includes nucleic acids encoding mutants of 
5'OT-EST, as defined above. Such nucleic acids may be used for all the purposes identified 
above in relation to wild-type 5'OT-EST nucleic acids. Particularly preferred are nucleic 
acids encoding 5'OT-EST - xdel, which preferably have the sequence 
ATGTTGCGGGCTTTGAACCGCCTGGCCGCGCGGCCCGGGGGCCAGCCCCCAACCCTGCTC 
CTTCTGCCCGTGCGCGGCCCACGGCCCCGCTCATTCTCGGCTCCTTTTTCCTCGCAGGAT 
AGO (see SEQ. ID. No. 7), or an equivalent sequence which encodes the same 
polypeptide having regard to the degeneracy of the nucleic acid code, or a sequence 
substantially homologous thereto or complementary thereto. In 5'OT-EST - xdel, exon x 
is deleted, and exons y and z are out of frame and therefore not translated. 
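
The notion of an equivalent sequence having regard to the degeneracy of the code may be illustrated by translating two candidate sequences and comparing the resulting polypeptides. The sketch below assumes the standard genetic code and uses short hypothetical sequences rather than any sequence of the invention; the names translate and same_polypeptide are likewise hypothetical. 

```python
# Minimal sketch: two DNA sequences are "equivalent" in the degeneracy
# sense if they translate to the same polypeptide.
# Assumption: standard genetic code, reading from the first nucleotide.
_BASES = "TCAG"
_AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
          "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: _AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(_BASES)
    for j, b in enumerate(_BASES)
    for k, c in enumerate(_BASES)
}


def translate(seq):
    """Translate complete codons of seq; '*' marks a stop codon."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


def same_polypeptide(seq_a, seq_b):
    """Return True if both sequences encode the same amino acid sequence."""
    return translate(seq_a) == translate(seq_b)
```

In this sketch, the hypothetical sequences ATGTTGCGG and ATGCTTAGA differ at several positions yet both encode Met-Leu-Arg, and would therefore be regarded as equivalent. 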

For hybridisation probes, it may be desirable to use nucleic acid analogues, in order to 
improve the stability and binding affinity. A number of modifications have been 
described that alter the chemistry of the phosphodiester backbone, sugars or 
heterocyclic bases. 

Among useful changes in the backbone chemistry are phosphorothioates; 
phosphorodithioates, where both of the non-bridging oxygens are substituted with 
sulphur; phosphoroamidites; alkyl phosphotriesters and boranophosphates. Achiral 
phosphate derivatives include 3'-O-5'-S-phosphorothioate, 3'-S-5'-O-phosphorothioate, 
3'-CH2-5'-O-phosphonate and 3'-NH-5'-O-phosphoroamidate. Peptide nucleic acids 
replace the entire phosphodiester backbone with a peptide linkage. 

Sugar modifications are also used to enhance stability and affinity. The α-anomer of 
deoxyribose may be used, where the base is inverted with respect to the natural 
β-anomer. The 2'-OH of the ribose sugar may be altered to form 2'-O-methyl or 
2'-O-allyl sugars, which provides resistance to degradation without compromising affinity. 

Modification of the heterocyclic bases must maintain proper base pairing. Some useful 
substitutions include deoxyuridine for deoxythymidine; 5-methyl-2'-deoxycytidine and 
5-bromo-2'-deoxycytidine for deoxycytidine. 5-propynyl-2'-deoxyuridine and 
5-propynyl-2'-deoxycytidine have been shown to increase affinity and biological 
activity when substituted for deoxythymidine and deoxycytidine, respectively. 

The DNA sequences, particularly nucleic acid analogues as described above, may be 
used as antisense sequences. 

In accordance with another embodiment of the present invention, there are provided 
cells containing the above-described nucleic acids. Host cells such as prokaryote, 
yeast and higher eukaryote cells may be used for replicating DNA and producing 
5'OT-EST. Suitable prokaryotes include eubacteria, such as Gram-negative or Gram-positive 
organisms, such as E. coli, e.g. E. coli K-12 strains, DH5α and HB101, or Bacilli. 
Further hosts suitable for 5'OT-EST encoding vectors include eukaryotic microbes such 
as filamentous fungi or yeast, e.g. Saccharomyces cerevisiae. Higher eukaryotic cells 
include insect and vertebrate cells, particularly mammalian cells, including human cells, 
or nucleated cells from other multicellular organisms. The propagation of vertebrate 
cells in culture (tissue culture) is a routine procedure. Examples of useful mammalian 
host cell lines are epithelial or fibroblastic cell lines such as Chinese hamster ovary 
(CHO) cells, NIH 3T3 cells, HeLa cells or 293T cells. The host cells referred to in this 
disclosure comprise cells in in vitro culture as well as cells that are within a host 
animal. 

DNA may be stably incorporated into cells or may be transiently expressed using 
methods known in the art. Stably transfected mammalian cells may be prepared by 
transfecting cells with an expression vector having a selectable marker gene, and 
growing the transfected cells under conditions selective for cells expressing the marker 
gene. To prepare transient transfectants, mammalian cells are transfected with a 
reporter gene to monitor transfection efficiency. 

To produce such stably or transiently transfected cells, the cells should be transfected 
with a sufficient amount of 5'OT-EST-encoding nucleic acid to form 5'OT-EST. The 
precise amounts of DNA encoding 5'OT-EST may be empirically determined and 
optimised for a particular cell and assay. 

Host cells are transfected or, preferably, transformed with the above-captioned 
expression or cloning vectors of this invention and cultured in conventional nutrient 
media modified as appropriate for inducing promoters, selecting transformants, or 
amplifying the genes encoding the desired sequences. Heterologous DNA may be 
introduced into host cells by any method known in the art, such as transfection with a 
vector encoding a heterologous DNA by the calcium phosphate coprecipitation 
technique or by electroporation. Numerous methods of transfection are known to the 
skilled worker in the field. Successful transfection is generally recognised when any 
indication of the operation of this vector occurs in the host cell. Transformation is 
achieved using standard techniques appropriate to the particular host cells used. 

Incorporation of cloned DNA into a suitable expression vector, transfection of 
eukaryotic cells with a plasmid vector or a combination of plasmid vectors, each 
encoding one or more distinct genes or with linear DNA, and selection of transfected 
cells are well known in the art (see, e.g. Sambrook et al. (1989) Molecular Cloning: A 
Laboratory Manual, Second Edition, Cold Spring Harbor Laboratory Press). 

Transfected or transformed cells are cultured using media and culturing methods known 
in the art, preferably under conditions whereby 5'OT-EST encoded by the DNA is 
expressed. The composition of suitable media is known to those in the art, so that they 
can be readily prepared. Suitable culturing media are also commercially available. 

The cDNA or genomic DNA encoding native or mutant 5'OT-EST can be incorporated 
into vectors for further manipulation. As used herein, vector (or plasmid) refers to 
discrete elements that are used to introduce heterologous DNA into cells for either 
expression or replication thereof. Selection and use of such vehicles are well within the 
skill of the artisan. Many vectors are available, and selection of an appropriate vector will 
depend on the intended use of the vector, i.e. whether it is to be used for DNA 
amplification or for DNA expression, the size of the DNA to be inserted into the 
vector, and the host cell to be transformed with the vector. Each vector contains 
various components depending on its function (amplification of DNA or expression of 
DNA) and the host cell for which it is compatible. The vector components generally 
include, but are not limited to, one or more of the following: an origin of replication, 
one or more marker genes, an enhancer element, a promoter, a transcription 
termination sequence and optionally a signal sequence. 

Both expression and cloning vectors generally contain a nucleic acid sequence that enables 
the vector to replicate in one or more selected host cells. Typically in cloning vectors, 
this sequence is one that enables the vector to replicate independently of the host 
chromosomal DNA, and includes origins of replication or autonomously replicating 
sequences. Such sequences are well known for a variety of bacteria, yeast and viruses. 
The origin of replication from the plasmid pBR322 is suitable for most Gram-negative 
bacteria, the 2μ plasmid origin is suitable for yeast, and various viral origins (e.g. 
SV40, polyoma, adenovirus) are useful for cloning vectors in mammalian cells. Generally, 
the origin of replication component is not needed for mammalian expression vectors 
unless these are used in mammalian cells competent for high level DNA replication, 
such as COS cells. 



Most expression vectors are shuttle vectors, i.e. they are capable of replication in at 
least one class of organisms but can be transfected into another class of organisms for 
expression. For example, a vector is cloned in E. coli and then the same vector is 
transfected into yeast or mammalian cells even though it is not capable of replicating 
independently of the host cell chromosome. DNA may also be replicated by insertion 
into the host genome. However, the recovery of genomic DNA encoding 5'OT-EST is 
more complex than that of exogenously replicated vector because restriction enzyme 
digestion is required to excise 5'OT-EST DNA. DNA can be amplified by PCR and be 
directly transfected into the host cells without any replication component. 

Advantageously, an expression and cloning vector may contain a selection gene, also 
referred to as a selectable marker. This gene encodes a protein necessary for the survival 
or growth of transformed host cells grown in a selective culture medium. Host cells not 
transformed with the vector containing the selection gene will not survive in the culture 
medium. Typical selection genes encode proteins that confer resistance to antibiotics 
and other toxins, e.g. ampicillin, neomycin, methotrexate or tetracycline, complement 
auxotrophic deficiencies, or supply critical nutrients not available from complex media. 

As to a selective gene marker appropriate for yeast, any marker gene can be used which 
facilitates the selection for transformants due to the phenotypic expression of the 
marker gene. Suitable markers for yeast are, for example, those conferring resistance to 
the antibiotics G418, hygromycin or bleomycin, or those providing for prototrophy in an 
auxotrophic yeast mutant, for example the URA3, LEU2, LYS2, TRP1, or HIS3 gene. 

Since the replication of vectors is conveniently done in E. coli, an E. coli genetic 
marker and an E. coli origin of replication are advantageously included. These can be 
obtained from E. coli plasmids, such as pBR322, Bluescript® vector or a pUC plasmid, 
e.g. pUC18 or pUC19, which contain both an E. coli replication origin and an E. coli genetic 
marker conferring resistance to antibiotics, such as ampicillin. 

Suitable selectable markers for mammalian cells are those that enable the identification 
of cells competent to take up 5'OT-EST nucleic acid, such as dihydrofolate reductase 
(DHFR, methotrexate resistance), thymidine kinase, or genes conferring resistance to 
G418 or hygromycin. The mammalian cell transformants are placed under selection 
pressure which only those transformants that have taken up and are expressing the 
marker are uniquely adapted to survive. In the case of a DHFR or glutamine synthetase 
(GS) marker, selection pressure can be imposed by culturing the transformants under 
conditions in which the pressure is progressively increased, thereby leading to 
amplification (at its chromosomal integration site) of both the selection gene and the 
linked DNA that encodes 5'OT-EST. Amplification is the process by which genes in 
greater demand for the production of a protein critical for growth, together with closely 
associated genes which may encode a desired protein, are reiterated in tandem within 
the chromosomes of recombinant cells. Increased quantities of desired protein are 
usually synthesised from thus amplified DNA. 

Expression and cloning vectors usually contain a promoter that is recognised by the 
host organism and is operably linked to 5'OT-EST nucleic acid. Such a promoter may 
be inducible or constitutive. The promoters are operably linked to DNA encoding 
5'OT-EST by removing the promoter from the source DNA by restriction enzyme 
digestion and inserting the isolated promoter sequence into the vector. Both the native 
5'OT-EST promoter sequence and many heterologous promoters may be used to direct 
amplification and/or expression of 5'OT-EST DNA. The term "operably linked" refers 
to a juxtaposition wherein the components described are in a relationship permitting 
them to function in their intended manner. A control sequence "operably linked" to a 
coding sequence is ligated in such a way that expression of the coding sequence is 
achieved under conditions compatible with the control sequences. 

Promoters suitable for use with prokaryotic hosts include, for example, the β-lactamase 
and lactose promoter systems, alkaline phosphatase, the tryptophan (trp) promoter 
system and hybrid promoters such as the tac promoter. Their nucleotide sequences have 
been published, thereby enabling the skilled worker operably to ligate them to DNA 
encoding 5'OT-EST, using linkers or adaptors to supply any required restriction sites. 
Promoters for use in bacterial systems will also generally contain a Shine-Dalgarno 
sequence operably linked to the DNA encoding 5'OT-EST. 

Preferred expression vectors are bacterial expression vectors which comprise a 
promoter of a bacteriophage such as phage λ or T7 which is capable of functioning in 
the bacteria. In one of the most widely used expression systems, the nucleic acid 
encoding the fusion protein may be transcribed from the vector by T7 RNA polymerase 
(Studier et al., Methods in Enzymol. 185: 60-89, 1990). In the E. coli BL21(DE3) host 
strain, used in conjunction with pET vectors, the T7 RNA polymerase is produced from 
the λ-lysogen DE3 in the host bacterium, and its expression is under the control of the 
IPTG inducible lac UV5 promoter. This system has been employed successfully for 
over-production of many proteins. Alternatively the polymerase gene may be 
introduced on a lambda phage by infection with an int- phage such as the CE6 phage 
which is commercially available (Novagen, Madison, USA). Other vectors include 
vectors containing the lambda PL promoter such as pLEX (Invitrogen, NL), vectors 
containing the trc promoters such as pTrcHisXpress™ (Invitrogen) or pTrc99 
(Pharmacia Biotech, SE), or vectors containing the tac promoter such as pKK223-3 
(Pharmacia Biotech) or pMAL (New England Biolabs, MA, USA). 



Moreover, the 5'OT-EST gene according to the invention preferably includes a 
secretion sequence in order to facilitate secretion of the polypeptide from bacterial 
hosts, such that it will be produced as a soluble native peptide rather than in an 
inclusion body. The peptide may be recovered from the bacterial periplasmic space, or 
the culture medium, as appropriate. 

Suitable promoting sequences for use with yeast hosts may be regulated or constitutive 
and are preferably derived from a highly expressed yeast gene, especially a 
Saccharomyces cerevisiae gene. Thus, the promoter of the TRP1 gene, the ADHI or 
ADHII gene, the acid phosphatase (PHO5) gene, a promoter of the yeast mating 
pheromone genes coding for the a- or α-factor or a promoter derived from a gene 
encoding a glycolytic enzyme such as the promoter of the enolase, 
glyceraldehyde-3-phosphate dehydrogenase (GAP), 3-phosphoglycerate kinase (PGK), 
hexokinase, pyruvate decarboxylase, phosphofructokinase, glucose-6-phosphate isomerase, 
3-phosphoglycerate mutase, pyruvate kinase, triose phosphate isomerase, phosphoglucose 
isomerase or glucokinase genes, the S. cerevisiae GAL4 gene, the S. pombe nmt1 
gene or a promoter from the TATA binding protein (TBP) gene can be used. 
Furthermore, it is possible to use hybrid promoters comprising upstream activation 
sequences (UAS) of one yeast gene and downstream promoter elements including a 
functional TATA box of another yeast gene, for example a hybrid promoter including 
the UAS(s) of the yeast PHO5 gene and downstream promoter elements including a 
functional TATA box of the yeast GAP gene (PHO5-GAP hybrid promoter). A suitable 
constitutive PHO5 promoter is e.g. a shortened acid phosphatase PHO5 promoter 
devoid of the upstream regulatory elements (UAS) such as the PHO5 (-173) promoter 
element starting at nucleotide -173 and ending at nucleotide -9 of the PHO5 gene. 

5'OT-EST gene transcription from vectors in mammalian hosts may be controlled by 
promoters derived from the genomes of viruses such as polyoma virus, adenovirus, 
fowlpox virus, bovine papilloma virus, avian sarcoma virus, cytomegalovirus (CMV), a 
retrovirus and Simian Virus 40 (SV40), from heterologous mammalian promoters such 
as the actin promoter or a very strong promoter, e.g. a ribosomal protein promoter, and 
from the promoter normally associated with 5'OT-EST sequence, provided such 
promoters are compatible with the host cell systems. 

Transcription of a DNA encoding 5'OT-EST by higher eukaryotes may be increased by 
inserting an enhancer sequence into the vector. Enhancers are relatively orientation and 
position independent. Many enhancer sequences are known from mammalian genes 
(e.g. elastase and globin). However, typically one will employ an enhancer from a 
eukaryotic cell virus. Examples include the SV40 enhancer on the late side of the 
replication origin (bp 100-270) and the CMV early promoter enhancer. The enhancer 
may be spliced into the vector at a position 5' or 3' to 5'OT-EST DNA, but is 
preferably located at a site 5' from the promoter. 

Advantageously, a eukaryotic expression vector encoding 5'OT-EST may comprise a 
locus control region (LCR). LCRs are capable of directing high-level integration site 
independent expression of transgenes integrated into host cell chromatin, which is of 
importance especially where the 5'OT-EST gene is to be expressed in the context of a 
permanently-transfected eukaryotic cell line in which chromosomal integration of the 
vector has occurred, in vectors designed for gene therapy applications or in transgenic 
animals. 

Eukaryotic expression vectors will also contain sequences necessary for the termination 
of transcription and for stabilising the mRNA. Such sequences are commonly available 
from the 5' and 3' untranslated regions of eukaryotic or viral DNAs or cDNAs. These 
regions contain nucleotide segments transcribed as polyadenylated fragments in the 
untranslated portion of the mRNA encoding 5'OT-EST. 

An expression vector includes any vector capable of expressing 5'OT-EST nucleic acids 
that are operatively linked with regulatory sequences, such as promoter regions, that 
are capable of expression of such DNAs. Thus, an expression vector refers to a 
recombinant DNA or RNA construct, such as a plasmid, a phage, recombinant virus or 
other vector, that upon introduction into an appropriate host cell, results in expression 
of the cloned DNA. Appropriate expression vectors are well known to those with 
ordinary skill in the art and include those that are replicable in eukaryotic and/or 
prokaryotic cells and those that remain episomal or those which integrate into the host 
cell genome. For example, DNAs encoding 5'OT-EST may be inserted into a vector 
suitable for expression of cDNAs in mammalian cells, e.g. a CMV enhancer-based 
vector such as pEVRF (Matthias et al. (1989) NAR 17, 6418). 

Particularly useful for practising the present invention are expression vectors that 
provide for the transient expression of DNA encoding 5'OT-EST in mammalian cells. 
Transient expression usually involves the use of an expression vector that is able to 
replicate efficiently in a host cell, such that the host cell accumulates many copies of the 
expression vector, and, in turn, synthesises high levels of 5'OT-EST. For the purposes 
of the present invention, transient expression systems are useful e.g. for identifying 
5'OT-EST mutants, to identify potential phosphorylation sites, or to characterise 
functional domains of the protein. 



Construction of vectors according to the invention employs conventional ligation 
techniques. Isolated plasmids or DNA fragments are cleaved, tailored, and religated in 
the form desired to generate the plasmids required. If desired, analysis to confirm 
correct sequences in the constructed plasmids is performed in a known fashion. Suitable 
methods for constructing expression vectors, preparing in vitro transcripts, introducing 
DNA into host cells, and performing analyses for assessing 5'OT-EST expression and 
function are known to those skilled in the art. Gene presence, amplification and/or 
expression may be measured in a sample directly, for example, by conventional 
Southern blotting, Northern blotting to quantitate the transcription of mRNA, dot 
blotting (DNA or RNA analysis), or in situ hybridisation, using an appropriately 
labelled probe which may be based on a sequence provided herein. Those skilled in the 
art will readily envisage how these methods may be modified, if desired. 

In a further aspect, the present invention provides a transgenic non-human animal which 
expresses, as a result of transformation with a transgene, 5'OT-EST or a mutant thereof as 
defined herein. Preferred animals include mammals, especially rats. 

Preferably, the non-human animal is a mammal suitable for use as a test system for 
therapies and treatments relating to obesity, including human obesity and animal obesity, 
which is of concern in household animals such as cats and dogs. Thus, the mammal may 
be a cat or a dog, or other household pet; it is preferably a rodent, such as a mouse or a rat, 
particularly a rodent adapted for laboratory testing whose genotype and general 
characteristics are well known. 

Any technique may be used to generate transgenic animals according to the invention. 
Preferably, the technique involves transfer of a transgene comprising 5'OT-EST to the 
pronucleus of a single-cell embryo, prior to implantation of the embryo into a 
pseudopregnant foster mother. Such techniques have the advantage that germ-line 
transgenic animals are readily produced. 

Alternatively, transgenic animals may be created by ES cell transfer techniques. In such 
techniques, ES cells are transformed with the desired transgene and then used to 
reconstitute an embryo. Animals created by such techniques are normally chimeric for the 
transgene. However, more accurate positional insertion of the transgene is possible, and 
selective deletion of endogenous genes by homologous recombination is facilitated 
(Mansour et al., 1989). 

Further techniques include targeted or non-targeted delivery of genes to whole animals, 
using viral or non-viral vectors. For example, genes may be delivered by recombinant 
retroviruses or adenovirus vectors, including adeno-associated virus vectors, which are 
capable of integrating into the genome of the animal and expressing the delivered gene. 
Non-viral vectors include liposomal vectors, antibody-targeted DNA-protein complexes 
and the like. 

As used herein, "transgenic" animals include animals from which 5'OT-EST has been 
deleted, as well as animals to which a 5'OT-EST transgene has been added. Optionally, 
the endogenous 5'OT-EST may be deleted, and a transgene bearing a heterologous or 
homologous 5'OT-EST gene, which may be wild-type or mutated, inserted into the 
animal. 

Preferred vectors for creating transgenic animals include linearised naked DNA from a 
variety of sources. In a preferred embodiment, transgenes may be derived from linearised 
cosmid sequences, from which the phage-related sequences have optionally been 
removed. 

The 5'OT-EST sequences used in a transgene according to the present invention may be 
inserted separately, or together with further sequences, including reporter genes, further 
effector genes and the like. Preferably, 5'OT-EST is comprised in a nucleic acid fragment 
which comprises the natural wild-type environment of 5'OT-EST, including flanking 
sequences. 

5'OT-EST is located proximal to the vasopressin (AVP) and oxytocin (OT) genes in the 
genome, being transcribed in opposite directions from positions closely linked in a single 
locus. 5'OT-EST lies 5' of the OT gene at the OT/AVP locus. Accordingly, the 
transgene preferably consists of the OT/AVP locus, including 5'OT-EST. 
Advantageously, one or more of the OT, AVP and 5'OT-EST genes may be mutated, for 
example by insertion of reporter genes, such as the hGH gene. 

In a highly preferred aspect, the transgene is cosmid cV014 as described in Figure 4 
herein. The complete sequence of cV014 is set forth in SEQ. ID. No. 17. 

Transgenic animals according to the invention may comprise single copies of the 
transgene, or may comprise multiple integrated copies, which may be present as 
concatamers. Preferably, transgenic animals according to the invention comprise four or 
more copies of the transgene. 

Transgenic animals according to the invention may be employed for a variety of purposes. 
The characteristics of male specificity, central distribution of adiposity, late onset and 
severity, and associated morbidity have parallels in the description of several human 
forms of central obesity. These include, but are not limited to, the condition known as 
metabolic syndrome, or Syndrome X, as well as other forms of central obesity which may 
be most severely expressed in human males, with or without reduced fertility and which 
are associated with increased morbidity. (For recent reviews of the importance of clinical 
and health care issues in obesity see Science 1998, vol. 280 pp. 1364-1390). Transgenic 
animals expressing 5'OT-EST or mutants thereof thus have particular beneficial utility as 
a novel animal model of late-onset human visceral obesity, preponderant in males. 

Moreover, the induction of the SLOB phenotype in juvenile rats as a result of dietary fat 
increases suggests that transgenic animals expressing 5'OT-EST are a model for juvenile 
obesity in mammals, predominantly male mammals, which is induced by the consumption 
of a high-fat diet. 

Furthermore, the onset of obesity in ovariectomised SLOB female rats suggests the model 
may be suitable to investigate post-menopausal obesity in female mammals. 

For instance, one recognised value of animals bearing 5'OT-EST or mutant constructs is 
to use such animals and their non-transgenic littermates as animal experimental models for 
studying obesity or male infertility and their related conditions. Using the information 
disclosed herein, it is possible to identify transgenic animals before they become obese or 
sexually mature, and to use them as a model for studying the factors that affect the 
development of obesity or male infertility in any animal classified as a mammal, including 
humans, domestic, and farm animals, and zoo, sports, or pet animals, such as but not 
limited to sheep, pigs, cows, horses, dogs, cats, etc. 

In particular, rodent models of obesity or infertility are of value in testing the ability of 
pharmaceutical preparations of novel agents to be beneficial in delaying or preventing the 
occurrence, development, course, severity, progression, or exacerbation of obesity or 
infertility (Mathe, 1995; Fan et al., 1997). Animals bearing 5'OT-EST or mutant 
constructs are particularly useful in testing agents in this regard, since the phenotype is 
predictable and non-transgenic littermates are ideal controls. 

In addition to screening for unknown compounds, animals bearing 5'OT-EST or mutants 
thereof may be particularly useful in studies employing administration of natural or 
recombinant proteins, peptides or other agents or their derivatives already known or 
suspected to be involved in some forms of obesity or male infertility (e.g. growth 
hormones, or reproductive hormones, their homologues, analogues, antagonists, inhibitors 
or secretagogues, or leptin, its homologues, analogues and antagonists) or other natural or 
pharmacological agents already known to be active and/or of therapeutic value in these 
conditions (e.g. insulin, thiazolidinones, catecholamines, gonadal steroids) or agents 
already known to affect their actions, distribution, catabolism or elimination. 

Typically in such studies, compounds may be administered to animals bearing 5'OT-EST 
or mutants thereof and their non-transgenic littermates by oral, parenteral (e.g., 
intramuscular, intraperitoneal, intravenous, or subcutaneous injection or infusion, or 
implant), nasal, pulmonary, rectal, sublingual, or topical routes of administration, and can 
be formulated in dosage forms appropriate for each route of administration, e.g. in soluble 
form, suspension, or other suitable pharmaceutical formulations. 

For example, the effects of such compounds on the obese phenotype may be assessed by 
carcass analysis, measurement of growth, body weight, body fat distribution, as well as 
other measures of analytes in body fluids or tissues relevant to obesity (Mathe, 1995; 
Shillabeer, 1992). These include, but are not limited to, cholesterol, triglycerides, fatty 
acids, lipoproteins, and other dietary constituents or metabolites, as well as metabolic 
hormones, such as leptin, insulin, glucagon, catecholamines or glucocorticoids. Other 
relevant parameters include cardiovascular measures (Reaven, 1988; Gray & Yudkin, 
1997). These may include measures of systolic or diastolic blood pressure, cardiac 
output, or vascular resistance, together with morphological changes to organ systems 
known to be affected by cardiovascular or obesity disorders, such as heart, major or minor 
blood vessels, their muscle or endothelial layers, and their elasticity or fragility. See for 
example McNamee et al. (1994). 

Similarly, parameters related to the infertility phenotype that may be measured include, 
but are not limited to, testicular weight, volume, development, spermatogenesis, sperm 
number, motility or ability to fertilise oocytes. They may also include measures of 
testicular fluid production and constituents, as well as products of other accessory organs 
including seminal vesicles or prostate, as well as hormones, receptors, and proteins 
important in male sexual function, such as testosterone, LH, FSH, inhibin or activin. 
Other responses that may be affected include energy expenditure, physical activity, 
ingestive behaviour, excretory behaviour, or reproductive behaviour, or the organs, 
hormones or receptors commonly recognised to be associated with these physiological 
systems, their metabolism or morphological structure. 

Compounds identified as effective in such screening or analysis based on the use of 
animals bearing 5'OT-EST or mutants thereof are particularly useful in treatment of 
late-onset visceral obesity, or male infertility, in particular where they occur in combination, 
and disorders related to these conditions with a view to delaying or preventing the 
occurrence, development, course, severity or progression of the phenotype, avoiding its 
exacerbation, and preferably promoting its amelioration or cure in animals of commercial 
importance, or more preferably in humans. 

In another embodiment, converse but also therapeutically valuable compounds may be 
developed based on screening or analysis as above in animals bearing 5'OT-EST or 
mutants thereof but which are intended to promote the occurrence, development, or 
progression of increased fat deposition or increased calorie intake or decreased energy 
consumption. Disorders in humans for which such compounds may be of value include, 
but are not limited to, wasting, or anorexia, or cachexia, associated with prolonged 
illness, or malabsorptive states or catabolic states associated with other diseases, such 
as, but not limited to, inflammatory conditions, Crohn's disease, or AIDS wasting, or 
burns, or cancer, or bone disease. 

Similarly, therapeutically valuable approaches may be developed based on screening or 
analysis as above in animals bearing 5'OT-EST or mutants thereof but which reduce the 
degree of male fertility in those conditions in which it might be beneficial. For example, 
this may be beneficial in the control of populations in animals of commercial or 
environmental importance, or to develop novel forms of contraception which may be 
effective in human males. Such approaches specifically include, but are not limited to, 
the possibility of blocking the function of 5'OT-EST or mutants thereof by administering 
5'OT-EST mutant products or by immunisation against 5'OT-EST to generate neutralising 
antibodies that interfere with the normal functioning of this gene product in the testis, 
hypothalamus, adrenal gland, gastrointestinal tract, or other organ system in which 
5'OT-EST is expressed or upon which its products act. 

The development and late onset of obesity in transgenic animals may be particularly 
useful in studying the chronic effects of novel food additives or formulations designed to 
prevent or exacerbate the deposition of fat in animals of commercial importance, or 
destined for use in human food products or dietary aids. Such compounds may be 
administered as above and their effects on the development, course, severity, progression, 
exacerbation, amelioration or cure of the obesity phenotype assessed as described above. 
Additives or formulations shown to reduce the development of visceral obesity in this 
model may have utility in human food products or dietary aids or find beneficial 
medicinal use in reducing fat accretion or retention. 

In another embodiment, transgenic animals, such as mice bearing 5'OT-EST or mutants 
thereof or in which 5'OT-EST has been disrupted, may be usefully intercrossed with other 
animal strains with defined mutations, or with undefined genetic backgrounds associated 
with propensity for the development of obesity or infertility. Comparison of the resulting 
progeny with or without the 5'OT-EST transgene will provide additional information on 
the alterations in occurrence, development, course, severity, progression, exacerbation, 
amelioration or cure of the obesity phenotype when expressed in these other genetic 
backgrounds, and analysed as described above. Such intercrossing may then be envisaged 
to enhance the utility of the resulting progeny exhibiting the obesity phenotype. 

Examples of this use include (without being limited to) interbreeding with Zucker fa/fa 
rats (Iida et al., 1996), corpulent (cp) rats (Kahle et al., 1997), OLETF rats (Takiguchi et 
al., 1998), ZDF rats, tfm rats, spontaneously hypertensive or salt-sensitive rats 
(Michaelis et al., 1995) or other dwarf rats such as dw/dw (Charlton et al., 1988) or dr/dr 
rats (Takeuchi et al., 1991). An example of the utility of this approach is given by 
Michaelis et al. (1995). Examples of mouse lines that may be usefully interbred with 
mice carrying transgenes or deletions affecting 5'OT-EST include, but are not limited to, 
ob/ob, db/db, tfm/tfm or hpg/hpg mice. A related example includes intercrossing mice 
carrying transgenes or deletions affecting 5'OT-EST with other strains of mice in which 
genes already known to be involved in obesity or male fertility have been deleted by 
homologous recombination or introduced by transgenesis (singly or in combination). 
Examples of these are already known to include (but are not limited to) leptin, tubby and 
related genes, NPY, insulin, GLP-1, IGF-1, IGF-II, MCH, CRH, POMC, CCK, orexins or 
hypocretins, CART peptides, agouti protein, as well as the genes or alternate products 
structurally related to, or homologous with, the above peptides. This example is also 
intended to include mice with disruptions in, or extra copies of normal or mutated forms 
of, genes for the specific receptors of the peptides listed above (for example NPY 
receptors, such as subtype 5), or bombesin-receptor 3, IRS-1 or 2, uncoupling proteins 
such as UCP1-3, carboxypeptidase E, PPARs, adrenergic receptor subtype 3 or TNF 
alpha, all of which have been implicated in obesity. 

Similarly, the fertility disruption in transgenic animals according to the invention may 
also be studied to advantage by crossing these animals or other animals in which 
5'OT-EST has been disrupted onto genetic backgrounds in which genes for gonadal or adrenal 
steroid biosynthesis or metabolism or gonadal steroid receptors and other reproductive 
hormones or their receptors or hypothalamic or pituitary hormones thought to affect male 
fertility (such as gonadotrophins, activins, inhibins, PRL, GnRH or transcription factors 
such as DAX1 or SF1, or other known gene products affecting male gonadal 
development, such as MIS, AMH, SCF) have been disrupted. Comparison of the resulting 
progeny with or without the 5'OT-EST or mutant transgene may shed light on the 
alterations in occurrence, development, course, severity, progression, exacerbation, 
amelioration or cure of the infertility phenotype when expressed in these other genetic 
backgrounds. Such intercrossed lines, for example with those genetic strains as outlined 
above, and in which the obesity phenotype is present in full or in a modified form, may 
also be additionally useful for screening applications. 

Conversely, the transfer of the obese phenotype onto these genetic backgrounds may also 
alter the occurrence, development, course, severity, progression, exacerbation, 
amelioration or cure of the specific phenotypes expressed in the strain with which animals 
bearing 5'OT-EST or mutants thereof are bred. Such intercrossed lines, for example with 
those genetic strains as outlined above, and in which the obesity phenotype is present in 
full or in a modified form, may also be useful for screening applications. 

Animals bearing 5'OT-EST or mutants thereof may be used to study the transfer of other 
gene products by means other than breeding, e.g. by administration of suitable vectors 
containing constructs expressing proteins of interest, or by transgenesis. Such examples 
include, but are not limited to, constructs containing the gene products or analogues of 
other genes already thought to be active in obesity or male infertility, whose effects may be 
advantageously studied in transgenic animals according to the invention due to the 
predictable development of their phenotype. Such genes and their products include those 
mentioned above in relation to alternative obese strains. Such derived animals, in which 
the obesity phenotype is present in full or in a modified form, may also be useful for the 
various applications outlined above. 

Animals bearing 5'OT-EST or mutants thereof and exhibiting a specific late-onset visceral 
obesity may prove of particular value when used in a similar way to screen for the 
beneficial effects of reducing or eliminating other gene products by their silencing or 
elimination as described above using transgenesis, or homologous recombination, or by 
adenoviral delivery of antisense nucleotides. Examination of any alterations in the 
occurrence, development, course, severity, or progression of the obesity phenotype in 
these genetic backgrounds would be of utility in identifying the role, if any, of such 
disrupted genes in the expression of the obesity phenotype. Such animals, in which the 
obesity phenotype is present in full or in a modified form, may also be useful for the 
applications outlined above. 

In another embodiment, animals bearing 5'OT-EST or mutants thereof or material derived 
from them and/or their non-transgenic littermates may prove useful in experiments 
designed to identify obesity-related or male-fertility-related differences in gene 
expression, RNA transcripts, proteins, or other biochemical measures, such as, but not 
limited to, lipids, peptides, carbohydrates, amino acids or compounds or precursors or 
metabolites thereof, or their distribution, in whole animals, or in samples of biological 
fluids taken from animals bearing 5'OT-EST or mutants thereof. These may include, but 
are not limited to: serum, plasma, lymph fluid, synovial fluid, follicular fluid, seminal 
fluid, amniotic fluid, milk, whole blood, urine, cerebrospinal fluid, saliva, sputum, tears, 
perspiration, mucus, tissue culture medium, tissue extracts, and cellular extracts. 

Similar analyses may be advantageously performed in samples of any tissue from animals 
bearing 5'OT-EST or mutants thereof or their non-transgenic littermates, or tissue derived 
from animals interbred with SLOB rats or other animals bearing 5'OT-EST or mutants 
thereof. Such tissues are preferably (but not limited to) endocrine tissues, such as 
pancreas, adrenals, or pituitary gland; adipose tissues from different locations, preferably 
but not limited to, inguinal, omental, perirenal, subcutaneous, mammary, periorbital or 
other regions; thermogenic fat, brown or white adipose tissue in other locations; areas of 
the CNS thought to be involved in obesity, preferably but not limited to the hypothalamus; 
and other tissues, preferably, but not restricted to, liver, gastrointestinal tract, gonads, 
heart, musculoskeletal system, immune system, kidney, connective tissue including skin, 
epithelial or endothelial tissues. 

Specifically included are cells or tissues removed from animals bearing 5'OT-EST or 
mutants thereof, or animals interbred with them, and maintained thereafter ex vivo, e.g. in 
tissue culture, or by transplantation in animals bearing 5'OT-EST or mutants thereof or 
other hosts, with or without immune suppression, provided the particular utility is 
enhanced by the presence of the obesity gene or phenotype. 

The invention thus provides the use of a tissue derived from a transgenic animal according 
to any one of claims 17 to 24 in a screen to identify a genetic cause of obesity, comprising 
the steps of: 

a) isolating one or more gene products from tissue derived from a transgenic animal 
according to any one of claims 17 to 24; and 

b) determining whether the expression of a gene product is correlated with obesity. 
Tissues derived from SLOB rats, including in particular fat pad tissue, may for example 
be used for differential screening in order to determine differences in gene expression 
between obese and non-obese animals. The gene products analysed may be nucleic acid 
or protein gene products. For example, mRNA may be isolated from the tissue and 
screened to identify differentially expressed transcripts. In particular, gene products 
which may be involved in cellular lipid transport may be identified by such means. 
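
The comparison underlying such a differential screen can be illustrated with a short sketch. This is not a method taken from the specification: the function name, the pseudocount, the log2 cut-off and the transcript counts are all hypothetical, chosen only to show how transcripts might be flagged as differentially expressed between obese and non-obese tissue.

```python
import math


def differential_transcripts(obese, lean, min_abs_log2_fc=1.0, pseudocount=1.0):
    """Return {transcript: log2 fold change} for transcripts passing the cut-off.

    `obese` and `lean` map transcript names to abundance measurements
    (hypothetical units). Only transcripts measured in both samples are compared.
    """
    hits = {}
    for name in obese.keys() & lean.keys():
        # The pseudocount (an assumption) avoids division by zero for absent transcripts.
        ratio = (obese[name] + pseudocount) / (lean[name] + pseudocount)
        log2_fc = math.log2(ratio)
        if abs(log2_fc) >= min_abs_log2_fc:
            hits[name] = log2_fc
    return hits


# Hypothetical abundances, for illustration only.
obese_fat_pad = {"transcript_A": 7, "transcript_B": 3, "transcript_C": 0}
lean_fat_pad = {"transcript_A": 1, "transcript_B": 3, "transcript_C": 3}
candidates = differential_transcripts(obese_fat_pad, lean_fat_pad)
```

With these made-up values, transcript_A and transcript_C pass the cut-off and transcript_B does not. In practice any cut-off would need to be chosen against replicate measurements.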

The development of obesity or male infertility itself in animals bearing 5'OT-EST or 
mutants thereof is predicted to induce secondary changes in other obesity- or 
fertility-related parameters and regulators. These include, but are not limited to, blood pressure, 
pituitary hormones, sperm development, maturation, and/or motility, lipid mobilising 
enzymes or receptors, or agents controlling these. The latter include, but are not limited 
to, leptin and its receptors, melanocortin, NPY, catecholamines, adrenal or gonadal or 
pituitary hormones, gut hormones such as insulin and glucagon, growth hormone and 
other growth factors such as members of the GH and IGF-1 families, their binding 
proteins and receptors. They may also include drugs of several classes that have been 
thought useful in obesity. Examples of such classes include agents affecting the serotonin 
system or the fat cell free fatty acid uptake or release or metabolism or lipase activity or 
hepatic lipid uptake, or insulin sensitisers. This example may also include morphological 
alterations in any tissue or cells of the cardiovascular system, including but not limited to, 
the heart and major blood vessels, other blood vessels carrying either arterial or venous 
blood, and any or all cells comprising these tissues. 

Transgenic animals according to the invention appear to present with obesity without 
obvious diabetes or hypercortisolism. They may thus prove particularly beneficial in 
studying the developmental changes in these secondary parameters induced by other 
means, in the development of diabetes, or hypertension or cardiovascular disease or 
hypercortisolism (Russell et al., 1993), all of which are known to be associated with obesity 
in humans (Reaven, 1988). Examples of such means include (but are not limited to) 
variation in dietary components or quantity, or treatment with diabetogenic agents, such 
as GH or cortisol. Examples of such agents affecting cardiac output or peripheral 
resistance or blood pressure include angiotensin-converting enzyme inhibitors or cardiac 
glycosides, or beta adrenergic receptor 3 agonists or antagonists. 

Morphological changes may also be seen in adipose tissues or cells, or the other tissues in 
the body containing fat, such as the liver and related cells, or the skeleton, or in other 
organs or tissues in the gonadal system, relating to the effects on male fertility. 
Differences in these measures detected specifically in animals bearing 5'OT-EST or 
mutants thereof and their alteration by elimination, blockade, endogenous stimulation, or 
exogenous administration of anti-obesity or other agents effective in obesity or male 
infertility or related disorders would provide novel approaches to evaluate, improve and 
perfect existing or novel therapeutic approaches to obesity or male infertility in other 
animals of commercial importance, and more preferably, in humans. An obvious example 
is the ready source of adipocyte cells and products from specific fat depots that are 
differentially increased in transgenic animals according to the invention. Responses to 
agents affecting fat cell metabolism or fat storage or lipogenesis or lipolysis, or to 
lipid-lowering agents, may be studied with particular advantage to discern effects on visceral 
or peripheral fat tissues, and to seek differential effects on fat from different depots in the 
animals. 

The information disclosed herein will enable those skilled in the art to produce protein or 
peptide fragments corresponding in sequence to 5'OT-EST or mutants thereof, as 
described above. Such proteins or peptides (or simple analogues thereof), when 
administered to rats or other animals, or more preferably humans, would be expected to 
affect the development of obesity and fertility in males, and serve as the basis for the 
development of similar compounds, based on the homologous human sequences, useful in 
the treatment of these conditions in humans. 

This also includes simple analogues that incorporate alterations known to improve the in 
vivo stability or delay the clearance, elimination or metabolism of proteins or peptides 
such as those derived from 5'OT-EST or mutants thereof. Such alterations are obvious to 
those skilled in the art, and examples include amidation or acetylation of C or N-termini 
of peptides, or replacement of methionine residues with norleucine residues to avoid 
oxidation. This example also includes formulations or modifications of proteins known 
to be effective for the same purposes, e.g. by PEGylation to prolong the half-life of peptides 
or proteins, or formulations of proteins with inert carriers (such as mannitol or lactose) or 
buffers or salts, that provide stable solutions suitable for in vivo administration of the 
active agents to animals or to humans. 

In another embodiment, the information disclosed herein will enable those skilled in the 
art to design nucleotide probes for, or develop polyclonal or monoclonal antibodies 
against, the DNA, RNA or protein sequences corresponding to the whole or parts of the 
5'OT-EST gene in other animals of commercial importance, or more preferably, humans. 
These are of value in diagnostic tests to screen for mutations in this gene in animal or 
human populations subject to variations in obesity or fertility. They may also be used to 
monitor the development, progression, amelioration or cure of obesity or infertility as may 
be reflected in changes in the activity of this gene or its products. Such predictive tests are 
recognised to have beneficial value when applied to the human population (Whitaker et 
al., 1997). 

Examples of such probes or peptides or proteins used to develop antibodies include those 
predicted from the wild-type and mutated sequences in the rat 5'OT-EST gene and 
mutants thereof, as well as their derivatives as described above, as well as those that may 
be inferred from homologous genes in human and mouse, either as intact sequences or 
formed in whole or in part as fusion sequences with other proteins to facilitate production 
or purification by standard methods known to those skilled in the art. For this purpose, 
products of the 5'OT-EST gene may also be reacted with, produced as fusion products 
with, or mixed with, other proteins or adjuvants known to enhance the immune response. 
Also included are modifications to the nucleotide sequences, already known to those 
skilled in the art to confer useful chemical properties on the products, for example by 
incorporating modified nucleotides to render them more stable, or to incorporate 
nucleotides tagged with functional groups such as biotin or digoxigenin, or incorporation 
of fluorescent or radioisotopically labelled derivatives, in order to render the products 
themselves more readily detectable. 

In a further embodiment, the information disclosed herein will enable those skilled in the 
art to isolate factors which interact directly with 5'OT-EST or mutants thereof. For 
example, two-hybrid screens provide a means for isolating genes for proteins which 
interact with 5'OT-EST or mutant proteins, their fragments or derivatives. Similarly, 
co-precipitation studies using antibodies directed against 5'OT-EST or mutants thereof, 
produced as outlined herein, might allow identification of such interacting factors. Such 
factors, when administered to rats or other animals, or more preferably humans, would be 
expected to affect the development of obesity and/or male infertility in a similar fashion to 
that seen in transgenic animals, and would therefore be predicted to have similar uses. 

The use of transgenic animals or materials derived from them, or from the information 
disclosed herein, has particular utility in providing a specific means of isolating such 
factors that interact directly with the novel gene products disclosed herein, as well as in 
screening their biological activities in vivo. 

The invention is described further, for the purpose of illustration only, in the following 
examples. 

MATERIALS AND METHODS 
Bacterial cultures. 

All media are made with double distilled water and autoclaved prior to use. Liquid 
cultures of bacteria are incubated with shaking at 37 °C in either LB broth or terrific 
broth. Bacterial colonies are grown on agar plates made with either LB broth or terrific 
broth with 15g/l bacto-agar. These media are supplemented with combinations of 20µg/ml 
or 50µg/ml ampicillin, 20µg/ml tetracycline and 0.2% glucose. Bacterial clones are stored 
at -80 °C after the addition of 15% glycerol. 

Purification of nucleic acids. 

Aqueous solutions containing DNA are purified by vortex mixing with an equal 
volume of phenol:chloroform:isoamyl alcohol (25:24:1). The emulsion is then centrifuged 
at 12,000 rpm for 5 minutes in a microfuge at room temperature. DNA is precipitated by 
adding 3M sodium acetate (pH 5.2) to a final concentration of 300mM and two volumes 
of absolute ethanol. The samples are frozen before centrifugation, the supernatant 
removed and the pellet resuspended in 10mM Tris.HCl pH 8, 1mM EDTA (TE buffer). 
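
The volumes implied by this precipitation step follow from a simple dilution balance, sketched below. The function name is hypothetical, and the sketch assumes that "two volumes" of ethanol is measured against the sample after the sodium acetate has been added, which the text does not state explicitly.

```python
def precipitation_volumes(sample_ul, stock_molar=3.0, final_molar=0.3, ethanol_volumes=2.0):
    """Return (sodium acetate in µl, ethanol in µl) for an ethanol precipitation.

    Solves C_stock * V_add = C_final * (V_sample + V_add) for V_add, so that the
    salt reaches `final_molar` in the combined aqueous volume.
    """
    if sample_ul <= 0:
        raise ValueError("sample volume must be positive")
    if not 0 < final_molar < stock_molar:
        raise ValueError("final concentration must lie between 0 and the stock concentration")
    acetate_ul = sample_ul * final_molar / (stock_molar - final_molar)
    # Assumption: ethanol volumes are relative to sample plus added salt solution.
    ethanol_ul = ethanol_volumes * (sample_ul + acetate_ul)
    return acetate_ul, ethanol_ul


# A 90 µl aqueous sample needs about 10 µl of 3M stock and about 200 µl of ethanol.
acetate, ethanol = precipitation_volumes(90.0)
```

For a 3M stock and a 300mM target this reduces to adding one ninth of the sample volume.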

DNA preparation from bacteria stocks. 

The alkaline lysis method of DNA isolation may be used (Birnboim and Doly, 
1979; Sambrook et al., 1989) to prepare plasmid DNA from small volumes of bacterial 
cultures, typically 10ml. For large scale preparation of plasmid and cosmid DNA, DNA 
may also be prepared from 1 L overnight cultures by the alkaline lysis method. DNA is 
dissolved in 100mM Tris.HCl pH 8, 1mM EDTA and further purified on a caesium 
chloride gradient which is centrifuged at 55,000rpm overnight (Sambrook et al., 1989). 

Preparation of genomic DNA from animal tissue. 

Rat tail biopsies, up to 1 cm, are taken from 10-14 day old rats and placed in 
50mM Tris.HCl pH 8, 100mM EDTA, 100mM NaCl ('tail mix'). Genomic DNA is 
prepared following a standard procedure (Hogan et al., 1986) involving incubation with 
proteinase K, RNase A, phenol extraction and precipitation with isopropanol. Genomic 
DNA from other tissue such as liver may be prepared by the same method, though this 
requires additional homogenisation in a larger volume (typically 5ml) of tail mix using a 
Kinematica Polytron PT 3000 homogeniser prior to the preparation of DNA from a 
smaller aliquot of homogenate. 

Restriction digestion of DNA. 

Restriction enzyme digestion is performed using standard procedures in 
accordance with manufacturers' instructions. Enzymes are sourced from Boehringer 
Mannheim, Cambio, or New England Biolabs. Plasmid DNA is incubated for up to 4 
hours whilst genomic DNA digests may be incubated overnight. 

Subcloning DNA fragments into plasmid vectors. 

Blunting of DNA fragments with a 3' overhang. 

After digestion of DNA with a restriction enzyme which leaves a 3' overhang, the 
overhang may be removed by incubation with T4 DNA polymerase to create a blunt end 
for ligation with other blunt-ended DNA fragments. The digests are phenol-extracted, 
ethanol-precipitated with the addition of 10µg tRNA, and resuspended in TE. MgCl2 and 
deoxynucleotide triphosphates (dNTPs) are added to final concentrations of 10mM and 
0.1mM respectively prior to the addition of 2 units of T4 DNA polymerase (New England 
Biolabs). The reaction is incubated for 15 minutes at 12 °C. The polymerase is inactivated 
at 75 °C for 10 minutes before purification. 

Blunting a DNA fragment by refilling the 5' overhang. 

DNA fragments with 5' overhangs may be blunted by filling in the single stranded 
ends. This may be done using the Klenow fragment of E. coli DNA polymerase I (New 
England Biolabs). The DNA is digested with an appropriate restriction enzyme, phenol 
extracted, ethanol precipitated with the addition of 1µg of tRNA and resuspended in TE. 
10x Klenow buffer (0.5M Tris.HCl, pH 7.6, 0.1M MgCl2) and dNTPs to a final 
concentration of 1x and 0.2mM respectively are added with 10 units of Klenow fragment. 
The reaction is incubated at 37 °C for 30 minutes prior to purification of the DNA. 

Vector dephosphorylation. 

Calf alkaline phosphatase (CAP) may be used to remove 5' phosphate groups from 
digested vectors to prevent self-ligation during subcloning. Plasmid and cosmid vectors, 
linearised with restriction enzymes, are incubated with 2 units of CAP (Boehringer) in 
50mM Tris.HCl, pH 8.5, 50 mM EDTA, for 30 minutes prior to purification. 

Inserting linkers into DNA fragments. 

Digested plasmid DNA may be blunted if necessary, phenol-extracted, 
ethanol-precipitated with the addition of 1µg of tRNA and resuspended in TE. 0.5-1µg of 
phosphorylated linkers are ligated to linearised, blunt-ended plasmid DNA. Ligations are 
performed in a final concentration of 1x ligase buffer (50mM Tris.HCl (pH 7.5), 10mM 
MgCl2, 10mM dithiothreitol, 1mM ATP, 25µg/ml bovine serum albumin (BSA), 0.5mM 
spermidine-HCl) with the addition of 400 units of T4 DNA ligase (New England Biolabs). 
The reactions are incubated at room temperature overnight. The enzyme is then 
inactivated at 65 °C for 15 minutes. The linkered fragments are digested with an excess 
amount of the appropriate restriction enzyme and the DNA purified prior to further 
subcloning procedures. 
Electrophoresis of DNA fragments. 

DNA fragments may be electrophoresed in gels of varying percentages of agarose 
in 90mM Tris-borate, 2mM EDTA, pH 8.0 (TBE buffer) containing 0.5ng/ml ethidium 
bromide. The DNA bands are visualised on an ultraviolet transilluminator and 
photographed. Size markers may be Lambda DNA digested with Bst EII, pUC19 DNA 
digested with Msp I, or a commercially available 1kb ladder (Gibco-BRL). 

Purification of DNA fragments. 

Digested, blunted, dephosphorylated or linkered DNA fragments are 
electrophoresed in low melting-point agarose. Gel bands are excised, melted at 65 °C for 
5 minutes, and extracted twice with phenol/0.3M NaOAc. Following a phenol extraction 
and ethanol precipitation with the addition of 1µg tRNA, the DNA is recovered by 
centrifugation and resuspended in TE. Alternatively, DNA fragments may be purified 
from agarose using the Prep-A-Gene DNA Purification System (Bio-Rad Laboratories) 
according to manufacturer's instructions. 

Purification of larger cosmid-containing fragments. 

Large vectors are digested and treated with 50 units of CAP for more than 3 
hours. EDTA and SDS are added to final concentrations of 5mM and 0.5% respectively. 
The phosphatase is denatured for 1 hour at 65 °C and the solution is phenol-extracted, 
ethanol precipitated with the addition of 1µg tRNA, and the DNA recovered is 
resuspended in TE. 

Ligation of DNA fragments into phosphatased vectors. 

After purification, DNA fragments and vectors are mixed at equimolar ratios at an 
approximate concentration of up to 80ng/ml, whilst DNA for recircularisation is used at a 
concentration of 20ng/ml. Ligation may be done in a volume of 5µl with 200 units of T4 
DNA ligase (New England Biolabs) in ligase buffer. Two control reactions are performed 
simultaneously, omitting the insert alone, or omitting both insert and ligase. Ligations are 
incubated overnight at 16 °C. 
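
Because equimolar mixing is defined by molecule number rather than mass, the mass of insert needed scales with the ratio of fragment lengths. The sketch below shows that arithmetic; the function name and the example vector and insert sizes are hypothetical and are not taken from the specification.

```python
def insert_mass_ng(vector_ng, vector_bp, insert_bp, molar_ratio=1.0):
    """Mass of insert (ng) that gives an insert:vector molar ratio of `molar_ratio`.

    Double-stranded DNA has roughly constant mass per base pair, so equal moles
    means mass in proportion to length.
    """
    if min(vector_ng, vector_bp, insert_bp, molar_ratio) <= 0:
        raise ValueError("all arguments must be positive")
    return vector_ng * (insert_bp / vector_bp) * molar_ratio


# Hypothetical example: 100 ng of an 8,000 bp vector with a 2,000 bp insert.
equimolar_insert = insert_mass_ng(100.0, 8000, 2000)  # 25.0 ng
```

The `molar_ratio` argument is left at 1.0 to match the equimolar mixing described above; other ratios are outside what the text specifies.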
Preparation of competent cells. 

psi-broth    5g/l bacto-yeast extract, 20g/l bacto-tryptone, 5g/l MgSO4, 
             adjusted to pH 7.6 with NaOH. 

TFbI         30mM KAc, 100mM KCl, 10mM CaCl2, 50mM MnCl2, 15% glycerol (v/v), 
             adjusted to pH 5.8 with acetic acid and filter sterilised. 

TFbII        10mM PIPES, 74mM CaCl2, 10mM KCl, 15% glycerol (v/v), 
             adjusted to pH 6.5 with acetic acid and filter sterilised. 



Competent cells yielding a transformation frequency >5x10[...] transformed colonies 
per µg of supercoiled plasmid DNA may be prepared by a method modified from 
Hanahan et al. (1983). Bacteria of the strain DH10B (Grant et al., 1990) are plated on an 
agar plate and grown overnight at 37 °C. 10ml of psi-broth is then inoculated with 4 
colonies from this plate. The bacteria are then shaken at 37 °C until OD550=0.3. 
5 ml of this broth is then diluted into 100 ml psi-broth and shaken until OD550=0.28. The 
flask is then placed on ice, and the bacteria are centrifuged at 4 °C for 15 minutes at 2,000 
rpm. The supernatant is removed and the pellet allowed to dry briefly before being 
resuspended in 20ml TFbI. This suspension is left on ice for 5 minutes and then 
centrifuged at 2,000 rpm for 10 minutes at 4 °C. The supernatant is then removed and the 
pellet resuspended in 3 ml TFbII and placed on ice for 15 minutes. Aliquots are then 
frozen on dry ice and stored at -80 °C. The competence of the cells may be tested by 
transforming plasmid DNA of known concentrations. 
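
Testing competence in this way amounts to expressing the colony count per µg of plasmid DNA actually plated. A minimal sketch of that calculation follows; the function name, the plated-fraction parameter and the example numbers are hypothetical, and no threshold value is implied.

```python
def transformation_efficiency(colonies, dna_ng, plated_fraction=1.0):
    """Return transformed colonies per µg of plasmid DNA.

    `dna_ng` is the mass of plasmid added to the cells and `plated_fraction`
    is the share of the transformation mixture spread on the plate.
    """
    if dna_ng <= 0 or not 0 < plated_fraction <= 1:
        raise ValueError("dna_ng must be positive and plated_fraction in (0, 1]")
    dna_ug_plated = dna_ng / 1000.0 * plated_fraction
    return colonies / dna_ug_plated


# Hypothetical example: 250 colonies from 0.1 ng of plasmid, one tenth plated,
# corresponds to roughly 2.5e7 colonies per µg.
efficiency = transformation_efficiency(250, 0.1, plated_fraction=0.1)
```

Using several known plasmid concentrations, as the text suggests, allows a check that the colony count scales with the amount of DNA.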

Transformation of competent cells. 

Competent cells are thawed on ice before 50µl of cells is added to each ligation. 
These tubes are then incubated on ice for 30 minutes. The cells are then subjected to 
heatshock at 42 °C for 90 seconds before being placed on ice for 2 minutes. 0.4ml of LB 
broth is added and the culture is shaken at 37 °C for 1 hour. Cells are then incubated 
overnight at 37 °C on agar plates containing the appropriate antibiotic. Single colonies are 
picked with a flamed wire loop and used to inoculate 10ml of media for minipreparation 
of plasmid DNA. 

Packaging of Cosmid DNA into bacteriophage particles. 

Cosmid constructs may be packaged into bacteriophage particles using Gigapack 
II packaging extracts (Stratagene) and E. coli strain DH10B is infected in accordance with 
manufacturer's instructions. 

Southern blotting. 

DNA (10µg of genomic DNA or 0.5µg of plasmid DNA) is digested with 
appropriate restriction enzymes and electrophoresed on agarose gels with marker DNA of 
known size. After photography, gels are treated as described by Sambrook et al. (1989) 
and the DNA transferred from the gels onto nitrocellulose filters by the capillary transfer 
method (Southern, 1975; Sambrook et al., 1989) and these are then baked for 2 hours. 

Radiolabelling of DNA fragments for Southern blots. 

DNA probes are obtained by gel purifying appropriate fragments from restriction 
digests of subcloned DNA. The DNA is denatured by being incubated for 3 minutes in a 
boiling water bath. The resulting single-stranded DNA fragments are radiolabelled with 
[alpha-32P]dCTP, e.g. by the random primer labelling kit, Prime-It II supplied from Stratagene, 
in accordance with manufacturer's instructions. The labelling reaction is halted by the 
addition of TES buffer to a final concentration of 10mM Tris.HCl (pH 7.5), 10mM 
EDTA, 0.1% SDS. Radiolabelled DNA probes are separated from unincorporated 
nucleotides by eluting through a column containing Sephadex G-50. 

Hybridisation of Southern blots. 

100x Denhardts solution         2% BSA, 2% Ficoll 400, 2% polyvinylpyrrolidone. 

Na phosphate buffer (pH 6.6)    352ml 1M Na2HPO4, 648ml 1M NaH2PO4. 

Prehybridisation mix            0.1mg/ml tRNA, 5x SSC, 50mM Na phosphate buffer (pH 6.6), 
                                10x Denhardts solution, 1% SDS. 

Hybridisation mix               Prehybridisation mix with the above-described 
                                radiolabelled DNA probe. 

Filters from Southern blotting are gently shaken at 65 °C in prehybridisation mix for a 
minimum of 2 hours. This solution is then replaced with hybridisation mix and incubated 
overnight. The filters are washed in varying concentrations of SSC with 0.1% SDS for 
varying amounts of time dependent on the DNA probe being used. Filters are then dried 
and placed between two intensifying screens at -70 °C with Kodak "X-Omat AR" film. 

Screening a rat cosmid library 

Duplicate filters from a Wistar rat cosmid library containing genomic DNA inserts in the 
pWE15 cosmid vector (Wahl et al., 1987) are prehybridised and hybridised with probes 
for the rat OT and AVP genes. Following overnight hybridisation, filters are washed with 
3x SSC/0.1% SDS for 20 minutes and then SSC/0.1% SDS twice for 20 minutes. Filters 
are briefly washed in 2x SSC, dried and autoradiographed. Duplicate hybridisation signals 
are aligned with the master filters and bacteria are picked, placed in media and left to 
diffuse. The resulting cultures are grown on terrific broth agar with 20µg/ml ampicillin 
and replica plated. Replica plating of the bacterial culture from the library screening may 
be performed as previously described (Sambrook et al., 1989). Replica filters are 
prehybridised and hybridised as above. Positively-hybridising colonies are picked from 
the master filters and grown in larger volumes of ampicillin-supplemented media for 
minipreparation and Southern blot analysis of the cosmid DNA. 






Purification of DNA for microinjection. 

50-100µg of DNA is digested with Not I to separate the cVO14 cosmid insert
from vector DNA. A salt gradient may be used as described by Dillon et al (1993) to
purify the 44kb fragment. A gradient former is used to pour a gradient ranging in NaCl
concentration from 5-25%. The digested DNA is applied to the top of the gradient which
is then centrifuged for 5.5 hours at 37,000 rpm. The solution is then removed in 500µl
aliquots which are examined by electrophoresis. Fractions containing the fragment to be
microinjected are pooled and ethanol precipitated. The pellet is dissolved in
microinjection buffer (10mM Tris.HCl, pH 7.5, 0.1mM EDTA). DNA may be purified
further using an Elutip column (Schleicher and Schuell) according to the manufacturer's
instructions. cVO14 DNA at a concentration of 1-10ng/µl, typically 2ng/µl, is used to
generate transgenic rats.
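
As an illustration of the working concentration quoted above, the dilution of a purified DNA stock to the typical 2ng/µl follows from C1 x V1 = C2 x V2. The Python sketch below is not part of the described method; the stock concentration (50ng/µl) and final volume (100µl) are hypothetical values chosen only for the example.

```python
def dilution_volumes(stock_ng_per_ul, target_ng_per_ul, final_volume_ul):
    """Return (stock volume, buffer volume) in µl using C1*V1 = C2*V2."""
    if target_ng_per_ul > stock_ng_per_ul:
        raise ValueError("target concentration exceeds stock concentration")
    stock_volume = target_ng_per_ul * final_volume_ul / stock_ng_per_ul
    return stock_volume, final_volume_ul - stock_volume


# Hypothetical example: a 50 ng/µl stock diluted to 2 ng/µl in a 100 µl final volume.
stock_ul, buffer_ul = dilution_volumes(50.0, 2.0, 100.0)
print(f"{stock_ul:.1f} µl DNA stock + {buffer_ul:.1f} µl microinjection buffer")
```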


Superovulation, microinjection and embryo transfers. 

40 day old prepubertal female Wistar rats are given intraperitoneal injections of 30
IU pregnant mare's serum (Folligon, Intervet Laboratories Ltd) between 9 and 11 o'clock
on day -3. The same rats are injected i.p. at midday on day -1 with 22.5 IU human
chorionic gonadotrophin (Chorulon, Intervet Laboratories Ltd) and placed in a cage with a
stud male of the same strain. On day 1, females are killed and their oviducts removed and
placed in M2 media (Hogan et al, 1986). The oviducts are dissected to release the eggs
which are subsequently placed in M2 media with 0.5mg/ml hyaluronidase (Sigma) in
order to remove the cumulus cells surrounding the eggs. After 5 minutes the eggs are
removed from the hyaluronidase solution, washed thoroughly in M2 and placed in the
unbuffered M16 (Hogan et al, 1986) in a 37 °C incubator supplemented with 5% CO2.
After 2 hours of incubation the male pronuclei of the eggs are microinjected using
standard procedures and equipment (Hogan et al, 1986). Microinjected eggs are
incubated overnight at 37 °C.


The following day (day 2), eggs which have divided to the two-cell stage are washed in
M2 media and transferred into the oviducts of pseudo-pregnant adult Wistar rats which
have been mated with vasectomized male rats the previous night. The surgery is
performed under halothane anaesthetic, with between 15 and 20 eggs being transferred
into each infundibulum. Resulting litters are tail-clipped at 2 weeks of age. The tails are
used for DNA preparation as described above and analysed by Southern blotting for
animals containing transgenes, also as described above.

RNA preparation.

RNA may be prepared from rat tissue by the acid guanidinium thiocyanate-phenol-chloroform
extraction method (Chomczynski et al, 1987). Briefly, tissue is homogenised
in 500µl 4M guanidinium thiocyanate, 25mM sodium citrate (pH 7.0), 0.5% (w/v) sodium
N-lauroylsarcosine, 100mM 2-mercaptoethanol prior to the addition of 33µl 3M sodium
acetate (pH 4.1), 500µl phenol and 100µl chloroform. The mixture is vortexed and placed
on ice for 15 minutes before centrifugation at 12,000rpm for 10 minutes at 4 °C. The
aqueous fraction is decanted into a fresh tube and precipitated with isopropanol.


In vitro transcription.

A plasmid containing a T7 polymerase promoter 5' to the inserted sequence is
linearised with a restriction enzyme which cuts at the 3' end of the insert. Transcripts
of the subcloned fragment are then obtained using a T7 transcription kit (Boehringer
Mannheim) according to the manufacturer's instructions.

DNA sequencing. 

Sequencing of DNA plasmid subclones may be performed manually with the 
Sequenase version 2.0 sequencing kit (United States Biochemicals) which employs the 
chain-termination method (Sanger et al, 1977), or by automated sequencing using an ABI
Prism DNA Sequencing Kit and 377 DNA Sequencer (Perkin Elmer Applied Biosystems) 
according to manufacturer's instructions. 

Reverse Transcription of RNA. 

RNA may be converted to cDNA using Superscript II reverse transcriptase
(GibcoBRL) according to the manufacturer's instructions, in combination with either an
oligo dT primer or another specific primer complementary to the RNA sequence of
interest.

Polymerase Chain Reaction amplification of DNA.

The polymerase chain reaction (PCR) may be used to amplify fragments of DNA
using 50µl of a reaction mix which contains 10mM Tris, pH8.3, 20mM KCl, 0.2mM
dNTPs, 200nM primers, 50-250ng template DNA, 2.5 units Amplitaq DNA polymerase
and 1-3mM MgCl2 (the optimal conditions for each amplification are determined
empirically). Conditions vary for each template target, but a typical amplification might
be to place the reaction mix in a thermal cycler (MJ Research Inc.), denature for 2 minutes
and then subject the reaction to 34 cycles of 94 °C for 1 minute, 58 °C for 1 minute and 72
°C for 1-5 minutes, depending on the length of the expected product.
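
To give a feel for how the extension time affects the typical cycling programme described above, the following Python sketch adds up the programmed hold times. It is an illustration only: the function name is hypothetical, and ramping time between temperatures and any final extension step (neither of which is specified in the text) are ignored.

```python
def pcr_program_minutes(extension_min, cycles=34, initial_denature_min=2.0,
                        denature_min=1.0, anneal_min=1.0):
    """Total programmed hold time in minutes, ignoring ramping between steps."""
    per_cycle = denature_min + anneal_min + extension_min
    return initial_denature_min + cycles * per_cycle


# The text gives an extension time of 1-5 minutes depending on product length.
for extension in (1.0, 5.0):
    total = pcr_program_minutes(extension)
    print(f"extension {extension:.0f} min: {total:.0f} min total")
```

Under these assumptions the programme takes between 104 and 240 minutes of hold time.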

Cloning of PCR products. 

Products generated by PCR may be cloned using a TOPO TA Cloning kit 
(Invitrogen) according to the manufacturer's instructions.

Nuclease Protection assay.

Riboprobe transcripts incorporating 32P may be generated by transcribing
approximately 250ng of DNA fragment in Transcription Optimised buffer (Promega),
500mM ATP, GTP and CTP, 20 units Rnasin RNA inhibitor (Promega), 20mM UTP,
100µCi [α-32P]UTP (Amersham) and 20 units of the appropriate RNA polymerase
(Promega) at 37 °C for 90 minutes. After treatment of the reaction with 2 units of DNase I
(Promega) at 37 °C for 20 minutes, the reaction is denatured and purified by
polyacrylamide electrophoresis, followed by excision of the labelled RNA and elution in
350µl of Elution buffer (Ambion) for 2 hours at 37 °C. Nuclease protection may be
performed essentially as described by Lee and Costlow (1987) using 32P-labelled
riboprobe with 1-20µg total RNA and the RPAII Ribonuclease Protection Kit (Ambion)
according to the manufacturer's instructions.

In Situ hybridisation.

Sense and anti-sense riboprobe transcripts may be generated using an SP6/T7
transcription kit (Boehringer) with 35S-UTP (NEN Research Products) according to the
manufacturer's instructions. In situ hybridisations may be performed as described in
Bennett et al (1995). Autoradiographs are analysed densitometrically, from a light box
using a video camera linked to a Power Macintosh 7600/132 running the programme NIH
Image version 1.61.

Immunocytochemistry

Human growth hormone (hGH) may be localised in pituitary and brain sections
using a modified avidin-biotin complex immunocytochemistry technique (Bourne et al.,
1984). Tissue is collected and fixed in 4% paraformaldehyde for 24 hours. Tissues may be
stored at 4 °C in 70% ethanol before embedding in paraffin wax and sectioning. Tissue
sections (6µm) are dewaxed in Histoclear (National Diagnostics) and rehydrated by
sequential 20 sec washes of 100%, 70% and 30% ethanol followed by a 1 min wash in
distilled H2O. Endogenous peroxidase activity is inhibited by a 30 min incubation in 3%
(v/v) hydrogen peroxide in methanol. Sections are then washed in distilled H2O for 1
min before being treated with 0.1% (w/v) trypsin (Sigma) for 15 min at 37 °C followed by
0.5% (v/v) Triton X-100 (Sigma) for 15 mins. After two 5 min washes of distilled H2O
and 0.05M Tris buffered saline (pH 7.6), 0.15M NaCl (TBS) the sections are incubated
with 20% (v/v) normal rabbit serum (DAKO) with 5% (w/v) BSA for 30 mins in order to
reduce non-specific background staining. The sections are then incubated overnight in a
humidity chamber at 4 °C with an antibody specific for hGH, such as sheep anti-hGH
primary antibody (1:30,000) (Scottish Antibody Production Unit).

Following two washes in TBS, sections are incubated with biotinylated rabbit anti-goat
serum (DAKO, 1:200) for 30 mins. The sections are again washed in TBS and incubated
for 30 mins with avidin complexed to biotinylated horseradish peroxidase (DAKO).
Human GH immunoreactivity is then visualised by development using
3,3'-diaminobenzidine tetrahydrochloride/hydrogen peroxide (DAB) (4mg/10ml in 0.05M Tris
buffer pH 7.6) containing 3% (v/v) H2O2. This reaction is quenched in distilled H2O prior
to counterstaining with Gill's haematoxylin (BDH), and the sections are coverslipped for
microscopic examination.

Radioimmunoassays 

Tissue samples are homogenised in varying volumes of phosphate buffered saline 
with either glass homogenisers (for volumes up to 1ml) or a Kinematica Polytron PT 3000 
homogeniser (for larger volumes). The same polyclonal sheep anti-hGH antibody 
(Scottish Antibody Production Unit) may be used to measure hGH in tissue extracts by 
RIA as described in Fairhall et al. (1992) using recombinant hGH as standard. AVP may 
be measured by RIA as described in Horn, Robinson & Fink (1985). Rat GH may be
measured as in Charton et al (1988). Bovine neurophysin may be measured by RIA
using a specific antiserum that does not recognise rat neurophysins (Gordon-Weeks,
1987). Rat leptin and rat insulin may be measured by specific RIAs using kits from Linco
Research Inc, following the manufacturer's instructions. Corticosterone may be measured
by a double antibody RIA kit obtained from ICN Biomedicals. Cholesterol and
triglycerides in blood samples may be measured using kits obtained from Sigma
Diagnostics ('Cholesterol 20' and 'Triglycerides, UV'). Plasma glucose values may be
measured using a Beckman glucose analyser.


EXAMPLE 1 

ISOLATION OF COSMID DNA, CONSTRUCTION OF TRANSGENE COSMIDS 
AND GENERATION OF TRANSGENIC RATS 

Isolation of cosmid DNA

Since the DNA sequences of the rat AVP and OT genes and their orientation and
structural relationship to each other in the rat genome are known (Ivell & Richter, 1984a;
Mohr et al, 1988; Schmitz et al., 1991) the size of restriction fragments which should be
detected with cDNA probes for these genes can be predicted. Colonies bearing rat DNA
which contained fragments hybridising to these OT and AVP probes in the same areas in
duplicate filters from the cosmid screening are aligned with the original bacterial plates.
These colonies are picked and grown and DNA prepared from them, digested with Hind
III, run on agarose gels, Southern blotted and hybridised to the same OT and AVP probes
again. Three positive colonies are chosen for further analysis because their differing
restriction fragment patterns indicated that they spanned different regions of the rat
AVP/OT locus. From Southern blotting of restriction digests using probes against the
first exon of each gene and the vector, they are found to span a total of 44kb, including
both genes, the 11kb intergenic region, 8kb of AVP 5' flanking sequence and 24kb of OT
5' flanking sequence. These three overlapping cosmids are designated cVO1, 2 and 3. An
overall schematic map of this region, indicating the location and orientation of AVP and
OT genes, the location of some important restriction sites, and areas of sequence known
or subsequently determined and disclosed here, is shown in Figure 1.

To facilitate further restriction mapping of the 5' flanking sequence of the rat OT gene,
8kb and 14kb Sma I fragments and an 8.5kb Kpn I fragment are subcloned into cloning
vectors and subjected to further restriction mapping. Smaller fragments of the OT and
AVP genes are also subcloned into pUC 19 derived plasmid vectors and used to remove
restriction enzyme sites and to insert the reporter genes into the rat OT and rat AVP loci.
Oligonucleotide linkers containing sequences for unique restriction sites are also inserted
in the 5' untranslated regions of the two genes to allow for future modifications of this
construct.
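
The prediction of restriction fragment sizes from a known sequence, as relied on in this section, can be sketched in a few lines of Python. This is an illustration only and not the procedure actually used: the function name is hypothetical, the toy sequence is invented for the example, and a linear molecule is assumed. The Hind III recognition site (AAGCTT, cut after the first A) is the standard one.

```python
def digest_fragment_lengths(sequence, site="AAGCTT", cut_offset=1):
    """Fragment lengths from cutting a linear sequence at every occurrence of site.

    cut_offset is the number of bases of the site left on the upstream fragment
    (1 for Hind III, which cuts A^AGCTT).
    """
    cuts = []
    pos = sequence.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = sequence.find(site, pos + 1)
    bounds = [0] + cuts + [len(sequence)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


# Invented 47-base toy sequence containing two Hind III sites.
toy = "G" * 10 + "AAGCTT" + "C" * 20 + "AAGCTT" + "G" * 5
print(digest_fragment_lengths(toy))
```

For the toy sequence the predicted fragments are 11, 26 and 10 bases long, which sum to the full 47 bases.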


The subcloning strategy used for the AVP locus is outlined in Figure 2. Essentially, the
aim is to insert a genomic hGH reporter fragment in a unique cloning site introduced into
the 5' untranslated region of the rat AVP gene. Swanson et al (1985) had previously shown
that hGH reporter transcripts, when fortuitously expressed, may be expressed and
translated efficiently in these neuronal cell types. Furthermore, hGH nucleotide
sequences can be differentiated from rat GH sequences by specific nucleotide probes
(Seeburg et al, 1977; Roksam & Rougeon, 1979) and the protein can be differentiated
from rat GH by specific antisera (Appendix 1). An Mlu I linker is initially inserted into a
smaller subclone of the rat AVP gene, replacing the Dra III site in the 5' untranslated
region. A genomic fragment of the human GH structural gene (Roksam & Rougeon,
1979) is then inserted as an Mlu I-linkered fragment spanning from the 5' untranslated
region of the hGH gene to a region 3' of the last exon and containing all 5 exons and 4
introns of hGH. This AVP-hGH fragment is inserted as a 12.2 kb Cla I-Xho I fragment
containing 450bp 5' and 8kb 3' of the transgene, with deletion of other Xho I restriction
sites.

The subcloning strategy used for the OT locus is outlined in Figure 3. In this case the
aim is to replace most of the rat OT structural gene with corresponding bovine sequences
(Land et al, 1983). The protein produced should function identically, but the bovine
sequences would provide a 'silent' reporter since they could be differentiated from rat
sequences (Mohr et al, 1988) by specific nucleotide probes, and the protein differentiated
from rat neurophysin by specific antisera. Rat neurophysin has previously been used as a
transgene reporter in mice (Belenky et al., 1992). Due to the constraints of suitable
restriction sites it is necessary to assemble a 5' construct of the hybrid rat/bovine OT gene
(containing exon A and most of exon B) and a 3' construct (containing a small fragment
of exon B and exon C) separately. These constructs are joined to produce the hybrid gene
with the 5' and 3' flanking sequences being added in subsequent steps. A Sal I linker is
also inserted immediately 5' to the translational start site of the bovine OT to provide a
unique cloning site within the AVP/OT locus for future modification of the construct. The
hybrid gene is inserted into the final construct as a 10.5 kb Mun I - Xho I fragment
containing 7.8 kb of 5' and 1.7 kb of 3' flanking sequence, after deleting other restriction
sites within the cosmid.

Assembly of the final construct. 

The pWE15 vector is modified to remove the unrequired SV2 neomycin gene.
This reduced the vector size from 8.5kb to 4.2 kb and therefore increased the size of the
insert that could be subcloned into the cosmids, which can efficiently package up to 52 kb
(Wahl et al, 1987). Cla I, Mun I, Sal I and Mlu I restriction sites are also removed from
the vector to permit subsequent cloning steps. The pWE15 cosmid vector has Not I sites
flanking the insert. Restriction mapping of cVO1 revealed a Not I restriction site 13kb
upstream from the rat OT gene (Figure 1) which would also be digested if Not I is used to
remove the insert. Therefore, a 4.6kb Aat II-Sca I fragment containing this site is
subcloned, the Not I site deleted, and the fragments ligated and replaced into the
construct. Digestion of the ligation product with Not I confirmed that this site had been
destroyed.

The modified OT and AVP gene fragments are inserted into cVO3, followed by addition
of the 5' region present in cVO1 but not in cVO3 using an Aat II fragment. This adds all
except 1.1 kb of the extreme 5' end of cVO1 and contains a Not I linker at the extreme 5'
end.

The final construct, termed cVO14, spans 44kb and includes 8kb 5' of rat AVP, 24kb 5'
of rat OT and 11kb of intergenic sequence. The construct has reporter gene hGH
sequences inserted into the 5' untranslated region of the rat AVP gene and parts of the
bovine OT gene sequences substituted for equivalent rat OT gene sequences. The final
cVO14 construct is illustrated diagrammatically in Figure 4.

Generation of transgenic rats bearing the cVO14 construct

The 44 kb Not I insert is released from cVO14 by Not I digestion, purified on a
salt gradient and microinjected into fertilised rat oocytes. These embryos are transferred
into pseudopregnant mothers and the offspring are analysed for the presence of the
transgenes. Genomic DNA prepared from tail biopsies of these pups is digested with Bgl
II, Southern blotted and hybridised with a radiolabelled genomic hGH probe that should
identify 2 predicted fragments of 0.9 and 2.1 kb from transgene DNA. This probe does
not detect endogenous rat GH sequences. Of 102 pups the hGH transgene is present in
the DNA of only 3 pups, termed JP17, JP19 and JP59. JP19 dies at 11 days of age, and
is not analysed further.


Other samples of DNA from the two remaining rats are digested with Pst I, Southern
blotted and hybridised with a radiolabelled probe that should identify two predicted
fragments of 0.9 and 1.6kb from the hybrid rat/bovine transgene sequence, and a single
2.5kb fragment from the endogenous OT gene. Both JP17 and JP59 rats are also found to
contain this hybrid gene, as well as the endogenous gene, whilst only the endogenous
fragment is visualised in DNA from non-transgenic rats.

DNA from JP17 and JP59 rats is also Southern blotted and probed with radiolabelled
DNA fragments corresponding to the ends of the cVO14 construct, which confirmed that
whole copies of the microinjected fragment are present in both rats. The copy number of
the transgenes is estimated by Southern blotting of Hind III fragments and hybridisation
with a probe for the rat AVP gene sequences, which recognised a 3.4 kb fragment
corresponding to the endogenous rat AVP gene and a 5.2 kb fragment which represents
the transgene with its hGH reporter gene insertion. Assuming equal affinity to
endogenous or transgene sequences, phosphorimaging these blots suggested that the JP17
rats contained at least 4 copies of cVO14 whereas JP59 rats had a single copy. The copy
number and restriction pattern of the transgenes remained consistent through successive
generations of breeding, suggestive of a single site of chromosomal integration.
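
One way such a copy number estimate can be derived from band intensities is sketched below in Python. The text does not give the calculation, so this is an assumption-laden illustration: it supposes that the 3.4 kb endogenous band represents two gene copies per diploid genome, that the probe binds both bands with equal affinity (as stated above), and that the signal values are hypothetical phosphorimager readings.

```python
def estimate_copy_number(transgene_signal, endogenous_signal, endogenous_copies=2):
    """Transgene copies per genome, scaled to the endogenous band (assumed 2 copies)."""
    if endogenous_signal <= 0:
        raise ValueError("endogenous signal must be positive")
    return endogenous_copies * transgene_signal / endogenous_signal


# Hypothetical band intensities (arbitrary units) for the 5.2 kb and 3.4 kb bands.
print(estimate_copy_number(transgene_signal=200.0, endogenous_signal=100.0))
print(estimate_copy_number(transgene_signal=50.0, endogenous_signal=100.0))
```

With these invented readings the first animal would be scored as carrying four copies and the second a single copy.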

Further analysis of DNA suggested that the insert contained concatamers of cVO14 in
JP17 but not in JP59, and that one concatamer pair contained a truncation which removes
a fragment of approximately 1kb between 8kb and 7kb 5' of rat AVP. Restriction
mapping and sequence analysis of the cosmid ends enabled us to design PCR primers
(PL216 (SEQ. ID. No. 10) and PL210 (SEQ. ID. No. 9)) that uniquely identify DNA from
JP17 rats bearing this insert, and distinguish them both from non-transgenic littermates
and from JP59 rats bearing a single copy of this insert.

Establishing colonies of transgenic rats. 

The founder JP17 rat is a male. He sires only a single litter of rats at 6 months of age
although constantly caged with fertile females. This litter contains both male and female
rats bearing the transgene, indicating that the integration has occurred onto an autosomal
chromosome. No further litters are sired by male progeny. Litters bred from transgenic
JP17 female progeny show an approximate 1:1 ratio of transgenic to non-transgenic rats
(46 transgenic versus 54 non-transgenic in the first 100 pups) suggesting that the
transgene does not have a detrimental effect on embryonic viability.


The founder of the JP59 line of rats is female and bred normally (the ratio of transgenic
to non-transgenic pups is approximately 1:1 (47 transgenic versus 53 non-transgenic in the
first 100 pups)). This single copy integrant is also present on an autosomal chromosome
since male JP59 rats of this line sire transgenic progeny of both sexes.
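
The "approximately 1:1" transmission ratios reported for the two lines can be checked against the Mendelian expectation with a chi-square goodness-of-fit test. The text does not report such a test, so the Python sketch below is offered only as an illustration; the function name is hypothetical, and 3.841 is the standard critical value for one degree of freedom at the 5% level.

```python
def chi_square_one_to_one(transgenic, non_transgenic):
    """Chi-square statistic for observed counts against a 1:1 expectation."""
    expected = (transgenic + non_transgenic) / 2
    return ((transgenic - expected) ** 2 + (non_transgenic - expected) ** 2) / expected


CRITICAL_5_PERCENT_DF1 = 3.841  # standard chi-square critical value, 1 d.f.

for line, counts in {"JP17": (46, 54), "JP59": (47, 53)}.items():
    statistic = chi_square_one_to_one(*counts)
    consistent = statistic < CRITICAL_5_PERCENT_DF1
    print(f"{line}: chi-square = {statistic:.2f}, consistent with 1:1: {consistent}")
```

Both statistics (0.64 and 0.36) fall well below the critical value, so neither set of counts departs detectably from a 1:1 ratio.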


EXAMPLE 2 

ANALYSIS OF EXPRESSION OF EXPECTED TRANSGENE PRODUCTS 

Human growth hormone 

The expression of hGH from cVO14 is investigated in both JP59 and JP17 rats.
Immunocytochemistry shows expression of hGH protein in hypothalamic magnocellular
paraventricular (PVN) and supraoptic nuclei (SON). Human GH immunoreactivity is also
transported via axons passing through the internal zone of the median eminence and
present in the axon terminals in the posterior pituitary.

In situ hybridisation confirms strong expression of hGH in the PVN and SON of
transgenic rats of both JP59 and JP17 lines. hGH transcripts are also detected in other
sites of AVP expression in the CNS in JP17 transgenic rats, such as the medial
amygdaloid nucleus and the habenula (Buijs, 1987; Gaffe et al, 1987; Urban et al., 1990).
Double in situ hybridisation analysis or immunocytochemical analysis confirms that hGH
expression is localised in AVP neurones, and not in OT neurones. In independent
studies, RT-PCR analysis detects hGH transcripts in hypothalami and pituitaries from
both lines, and also detects transcripts in the pancreas and, faintly, in adrenals of
JP17 rats, but not in other tissues tested. These findings are in accordance with previous
reports of extrapituitary expression of the endogenous AVP gene in these tissues.

Radioimmunoassay for hGH confirms the presence of significant quantities of hGH in
posterior pituitary extracts from both JP59 and JP17 animals, with larger amounts in the JP17
line consistent with their higher copy number. Small amounts of hGH immunoreactivity
are also found in the pancreas (0.016 ± 0.0075 ng/mg of tissue, n=3) of 20 week old JP17
male rats, though this represents < 0.1% of the amounts of hGH found in the posterior
pituitary extracts of the same rats (168 ± 16 ng/mg, n=5). Thymus, heart, kidney, fat,
liver, ovary, uterus, testis, lung, cortex, cerebellum, spleen and adrenals all had
undetectable levels of this protein (<0.0004ng of hGH/mg of tissue).
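
The "< 0.1%" comparison above can be verified directly from the quoted mean values. The short Python calculation below is illustrative only; it uses the means as reported and ignores the stated errors.

```python
pancreas_ng_per_mg = 0.016      # mean pancreatic hGH quoted above
pituitary_ng_per_mg = 168.0     # mean posterior pituitary hGH quoted above
detection_limit_ng_per_mg = 0.0004

percent_of_pituitary = 100.0 * pancreas_ng_per_mg / pituitary_ng_per_mg
fold_over_limit = pancreas_ng_per_mg / detection_limit_ng_per_mg

print(f"pancreas hGH is {percent_of_pituitary:.4f}% of the posterior pituitary level")
print(f"pancreas hGH is about {fold_over_limit:.0f}-fold above the detection limit")
```

On these figures the pancreatic level is roughly 0.01% of the pituitary level, about 40-fold above the assay's detection limit.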

If the hGH transgene is correctly expressed, then stimuli for increased AVP synthesis and
release should increase hypothalamic expression of the hGH transgene and decrease
pituitary stores of hGH. Chronic osmotic stimulation has been shown to regulate the
expression of the AVP gene (Lightman et al., 1987; Murphy et al, 1990) and cause a
release of AVP from the posterior pituitary (Fitzsimmons et al., 1994). The stimulus of
salt-loading has previously been used to detect whether the DNA regulatory regions
responsible for physiological regulation of the rat AVP gene are present within
microinjected constructs (Zeng et al, 1994b; Waller et al, 1996). Groups of
non-transgenic or transgenic JP59 or JP17 male rats given 2% NaCl w/v in their drinking
water for 72 hours both show a marked increase in hGH expression in PVN and SON.
Furthermore, posterior pituitary hGH content falls significantly in such salt-loaded animals
in parallel with the fall in AVP content. Samples taken from JP17 rats confirm that hGH
is secreted, and can be detected in plasma by RIA (1.3 ± 0.09 ng/ml).


The effects of transgenic expression of hGH to reduce rat GH by feedback have been
documented earlier in other transgenic rats (Flavell et al., 1996). Therefore, rat GH
content of the pituitaries of JP17 rats and non-transgenic littermates is measured by
RIA. Rat GH is significantly reduced in both the male and female JP17 transgenics in
comparison to the non-transgenic controls at 23, 77 and 140 days of age. At 140 days, the
mean pituitary rat GH content of male JP17 rats is 34.2% of that of the age-matched
non-transgenics. The pituitary rat GH content is less affected in the female JP17 rats (57.4%
of the mean rat GH content of the non-transgenics, p<0.02).



The size of the anterior pituitaries also suggests that there is a reduction in their cell
number as JP17 male rats at 140 days have significantly smaller anterior pituitaries than
non-transgenic controls (4.6 ± 0.1mg for JP17 males versus 8.2 ± 0.5mg for wild-type
rats, p<0.002, n=6). The pituitaries of JP59 males and females do not show a reduction in
rat GH content or size (p>0.55).


Bovine neurophysin 

No bovine OT-NP protein can be detected in posterior pituitary extracts (<10pg
per pituitary) from JP17 rats, using a specific RIA that distinguishes bovine neurophysin
from rat neurophysins (Gordon-Weeks, 1987). An RT-PCR assay for bovine OT-NP
transcripts is applied to hypothalamic extracts of adult males or of lactating female rats
culled within 24 hours of littering, from both lines. The latter animals are chosen since
they should show higher levels of endogenous OT expression (Van Tol et al., 1988).
PCR is performed on cDNA generated by reverse transcribing RNA from various tissues
of both JP17 and JP59 lines (hypothalamus, pituitary, pancreas, ovary, heart, lung,
muscle, thymus, cerebellum, uterus, testis, spleen, kidney, adrenals, liver, cortex).
Additional reactions with primers for β-actin and hGH are also included. The reactions of
the rat/bovine OT primers with the cVO14 construct and in vitro transcribed rat/bovine
OT RNA both yielded the correct size fragment (767bp), but no transcripts from the
rat/bovine OT transgene are detected in any tissue. We conclude that the rat/bovine OT
portion of the cVO14 construct is not detectably expressed in JP17 or JP59 transgenic
animals.

The lack of expression of the rat/bovine OT transgene may indicate that additional
sequences lacking from cVO14 are required to achieve appropriate OT expression in
addition to expression from the AVP locus. Another possibility is that alterations
introduced into the OT locus prevent expression. This could be in the coding regions of
the hybrid rat/bovine OT cassette. Another possibility is the introduction of the Sal I
linker 5' of the OT gene. A further possibility is the presence of a base change in the
region immediately 3' of the TATA-box which is discovered upon sequencing this region
of cVO14 (Figure 5).

EXAMPLE 3

DISCOVERY AND ANALYSIS OF 5'OT-EST AND 5'OT-EST-xdel

cVO14 is noted to contain a CpG island 13kb upstream of OT. Sequencing of 3.3kb of
this region of the cosmid reveals a potential novel gene. Comparison with EST sequences
in the public databases revealed partial matches to sequences from rat, human and mouse
origin. The GenBank accession numbers for such ESTs include: H31114; H31115;
AA955566; AA850004; AA104183; AA080247; AA245389; AA242211; AA421310;
AA505752; AA421393. Such searches also reveal a partial match to a human genomic
sequence GenBank Accession number AF036329. From comparisons with these
sequences and the rat genomic DNA sequence disclosed herein, it is predicted that the
novel rat gene in cVO14 contains four open reading frames, termed w, x, y, z. This gene
is termed 5'OT-EST. The genomic DNA sequence and predicted exon structure is
disclosed in SEQ. ID. No. 16. The gene predicts an open reading frame (SEQ. ID. No. 1)
and a protein of 200 amino acids, termed 5'OT-EST, whose structure is disclosed in SEQ.
ID. No. 2. Comparisons with EST sequences from human and mouse sources and
alignment with full length sequences from rat DNA enable the prediction of homologous
cDNA and protein sequences in these species and these are disclosed in SEQ. ID. No. 3
and 4 and SEQ. ID. No. 5 and 6 respectively. The protein sequences predicted from these
predicted cDNAs are highly homologous, as shown in Figure 6.

A Not I restriction site is identified approximately 13kb upstream of OT in cVO3. As
described in Appendix 2, this site is deliberately destroyed during the construction and
assembly of cVO14 in the pWE15 cosmid vector, as the construct required Not I sites
only at the ends of the insert. However, sequence analysis of cVO14 reveals that the Not
I site lies in 5'OT-EST, more precisely, in exon w of 5'OT-EST. Furthermore, this
sequence analysis reveals that in addition to destroying the Not I site, the procedure used
(digestion, filling in and religation) also resulted in an additional unpredicted deletion of
412bp. This deletion includes all of the sequences recognised as exon x as defined herein.
The mutated form of this gene, lacking sequences including those for exon x, is therefore
termed herein 5'OT-EST-xdel for the purposes of this application. Its sequence, and the
structures of the predicted exons from this form of the gene are disclosed in SEQ. ID. Nos.
7 and 8. The presence of 5'OT-EST-xdel in the genome of JP17 and JP59 rats is
confirmed by the generation of the predicted shorter product upon amplification of
genomic DNA from these animals by PCR with primers PL266
(5'TCATGTTGCGGGCTTTGAAC) and PL271 (5'TCTTTCAGTTGCACCCAAGC)
which flank the deletion (see SEQ. ID. Nos. 11 and 12 respectively).
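
The logic of this assay, in which primers flanking a deletion give a product shorter by the length of the deletion, can be illustrated with a small in-silico PCR in Python. The sketch is hypothetical in several respects: the templates are synthetic filler sequences rather than the real locus, the wild-type product size (640bp here) is invented, and PL271 is assumed to be the reverse primer. Only the primer sequences and the 412bp deletion length come from the text.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def amplicon_length(template, forward, reverse):
    """Length of the product between a forward primer and a reverse primer, or None."""
    start = template.find(forward)
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse)
    end = template.find(reverse_site, start + len(forward))
    if end == -1:
        return None
    return end + len(reverse_site) - start


PL266 = "TCATGTTGCGGGCTTTGAAC"  # primer sequence from the text
PL271 = "TCTTTCAGTTGCACCCAAGC"  # primer sequence from the text (assumed reverse primer)

filler = "ACGT" * 150  # 600 bases of synthetic sequence, not the real locus
wild_type = PL266 + filler + reverse_complement(PL271)
xdel = PL266 + filler[:100] + filler[100 + 412:] + reverse_complement(PL271)

wt_size = amplicon_length(wild_type, PL266, PL271)
xdel_size = amplicon_length(xdel, PL266, PL271)
print(wt_size, xdel_size, wt_size - xdel_size)
```

On the synthetic templates the two products differ by exactly 412 bases, which is the size difference that would distinguish the deleted form from the wild-type form on a gel.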


The form of 5'OT-EST that is incorporated in both JP17 and JP59, in 4 copies and 1 copy 
respectively, is mutated from the wild type sequence. 5'OT-EST-xdel would be predicted 
to give rise to an altered mRNA which, if translated, would produce a truncated protein 
product with an additional novel amino acid sequence. The predicted sequence of this 
novel product, termed herein 5'OT-EST-xdel, is disclosed in SEQ. ID. No. 8. Comparison 
with the aligned predicted protein sequences of 5'OT-EST in normal rats and in other 
species predicts that the protein translated from this RNA would contain an altered exon 
w, with a novel C-terminal peptide sequence (shown in lower case beginning at the arrow 
in Figure 6) predicted to arise by translation of DNA sequences normally present as part 
of an intron in 5'OT-EST. Searches in the protein databases in the public domain do not 
find any significant matches of this mutated protein sequence to known sequences. 

To demonstrate that both the endogenous and truncated forms of 5'OT-EST are 
transcribed in JP59 and JP17 rats, PCR primers are designed which can distinguish 
between these gene products. The sequences of these primers are given in SEQ. ID. Nos. 11, 
12 and 13. RT-PCR using these primers confirms the presence of transcripts from the 
endogenous form of 5'OT-EST in testicular RNA extracts from JP17, JP59 and wild-type 
rats, but the presence of a transcript with the 412bp deletion only in such tissue extracts 
from JP17 and JP59 rats. Sequencing of amplification products generated by PCR with 
primers PL266 (SEQ. ID. No. 11) and PL273 (SEQ. ID. No. 13) from wild type and JP17 
rats confirms this region of the sequence of the endogenous rat transcript as well as the 
truncated 5'OT-EST-xdel sequence disclosed in SEQ. ID. No. 8. Extracts of a rat adrenal 
medullary cell line (PC12 cells) also contain an RNA product of 5'OT-EST of the 
expected size. 
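
The relationship between the truncated coding sequence and its predicted product can be checked by translating the sequence directly. The sketch below is illustrative only: it assumes the standard genetic code, and it takes as input the mutant nucleotide and peptide sequences recited in claims 10, 7 and 6 of this specification. The helper and variable names are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA sequence, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Nucleotide sequence recited in claim 10 (line breaks removed).
XDEL_CDS = (
    "ATGTTGCGGGCTTTGAACCGCCTGGCCGCGCGGCCCGGGGGCCAGCCCCCAACCCTGCTCCTTCTGCCCGTGC"
    "GCGGCCCACGGCCCCGCTCATTCTCGGCTCCTTTTTCCTCGCAGGATAGC"
)
# Peptide sequences recited in claims 7 and 6 respectively.
XDEL_PEPTIDE = "MLRALNRLAARPGGQPPTLLLLPVRGPRPRSFSAPFSSQDS"
NOVEL_TAIL = "PRPRSFSAPFSSQDS"

translated = translate(XDEL_CDS)
print(translated == XDEL_PEPTIDE, translated.endswith(NOVEL_TAIL))
```

If both comparisons hold, the claimed nucleotide sequence encodes the 41-residue product, of which the final 15 residues form the novel C-terminal peptide.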

Identification and sequencing of 5'OT-EST and 5'OT-EST-xdel enables the design of 
probes to carry out in situ hybridisation and RNAse protection analysis for the products 
of these genes on normal and JP17 rat tissue extracts. In situ hybridisation with probes 
complementary to exons w or z (more specifically, corresponding to bases 1020-1167 and 
2229-2451 of Fig 4.1 respectively) on hypothalamic sections from wild-type or transgenic 
animals reveals a highly specific expression in magnocellular SON and PVN. No other 
specific expression in different brain regions is observed at this level of detection. This is 
an unexpected finding, which is repeatedly confirmed. 5'OT-EST is a novel member of 
the AVP/OT locus and is expressed in the same hypothalamic magnocellular neurones. 
Similar patterns of expression are seen with both probes and no differences in tissue 
distribution of hybridisation signal are seen between wild-type and JP17 tissues. Further in 
situ hybridisation analysis on a wide variety of tissues reveals strong expression in the 
testis consistent in distribution with tubular or Sertoli cell expression. Sparse expression 
is also seen in other tissues, including lung, spleen, intestinal smooth muscle and adrenal 
gland. 

From the sequence information, it is further possible to design probes for in situ 
hybridisation analysis that distinguish completely between the forms of mRNA produced 
from 5'OT-EST and 5'OT-EST-xdel. More specifically, oligonucleotide probes directed 
against transcripts containing exon x would be predicted to detect 5'OT-EST but not 
5'OT-EST-xdel transcripts, whilst probes directed against the intron sequence in 5'OT-EST 
that immediately follows the truncation in 5'OT-EST-xdel detect transcripts 
containing this sequence, which code for the truncated product, in extracts from rats 
expressing 5'OT-EST-xdel transcripts (such as JP17 and JP59 rats) but not from 
non-transgenic rats. Examples of such probes are given in SEQ. ID. Nos. 14 and 15. 

An oligonucleotide probe of the sequence depicted in SEQ. ID. No. 15 (specific for the 
truncated sequence) is used for in situ hybridisation and confirms transgene expression 
specifically in PVN and SON in JP17 rats, whereas no signal is observed in PVN or 
SON sections from non-transgenic rats hybridised at the same time with this probe. 

Nuclease protection analysis may also be performed using a riboprobe to exon w as 
described above. From the sequence we disclose herein, this probe would be predicted to 
protect 147bp and 94bp bands from transcripts from 5'OT-EST and 5'OT-EST-xdel 
respectively. Using such a probe to analyse testicular RNA extracts confirms that the full 
length transcript is present in both transgenic and non-transgenic animals, that the 
truncated product is present in JP17 and JP59 extracts at a level consistent with the copy 
number of the cV014 insertion, and that the truncated transcript is indeed absent from 
non-transgenic testis extracts. The full length product is present in control extracts of PC12 
cells. 
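
The predicted band sizes follow from how much of the probe remains base-paired with each transcript. The sketch below is a hedged illustration, not the analysis actually performed: it treats the exon w probe coordinates (bases 1020-1167) as a half-open interval of 147 bases, and it assumes a hypothetical point, 94 bases into the probe, at which the 5'OT-EST-xdel transcript diverges from the wild-type sequence. That coordinate is inferred here from the 94bp band and is not stated in the text; the exon boundaries used are likewise invented for the example.

```python
from typing import Tuple

Interval = Tuple[int, int]  # half-open [start, end)


def overlap_length(a: Interval, b: Interval) -> int:
    """Number of bases shared by two half-open intervals (0 if disjoint)."""
    start = max(a[0], b[0])
    end = min(a[1], b[1])
    return max(0, end - start)


# Probe coordinates from the text, read as a half-open interval (147 bases).
PROBE: Interval = (1020, 1167)

# Hypothetical extents of probe-matching sequence in each transcript.
WILD_TYPE_EXON_W: Interval = (1000, 1200)   # assumed to span the whole probe
ASSUMED_BREAKPOINT = PROBE[0] + 94          # assumption, inferred from the 94bp band
XDEL_EXON_W: Interval = (1000, ASSUMED_BREAKPOINT)

wild_type_band = overlap_length(PROBE, WILD_TYPE_EXON_W)
xdel_band = overlap_length(PROBE, XDEL_EXON_W)
print(wild_type_band, xdel_band)
```

Under these assumptions a sample containing both transcripts would show both protected fragments, while a non-transgenic sample would show only the longer one.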

The gene termed 5'OT-EST-xdel, present in cV014 in both JP59 and JP17 rats, is 
transcribed in several tissues in JP17 rats, specifically in hypothalamic cells and in 
testicular cells, and the sequence of the truncated transcripts, if translated, would give 
rise to a protein product that is severely truncated with respect to the normal gene product 
and would contain an additional novel peptide sequence. 

EXAMPLE 4 

PHENOTYPIC ANALYSIS OF JP17 TRANSGENIC (SLOB) RATS 



Growth measurements 

JP17 transgenic rats of both sexes and non-transgenic littermates are weighed at 
regular intervals. Male JP17 rats show a slight but significant reduction in their body 
weight up to 120 days of age (p<0.01). This juvenile growth retardation is not seen in 
females of this line or in rats of either sex of the JP59 line, whose body weights are not 
significantly different from those of the non-transgenic groups (p>0.7). This effect 
disappears with time. At 140 days, the weight difference between JP17 and 
non-transgenic male rats is no longer significant. Some organs of 140 day old rats are 
dissected and weighed. Heart weights do not differ significantly (p<0.14) but the weights 
of the kidneys (0.99 ± 0.03g in JP17 rats versus 1.24 ± 0.05g in non-transgenic rats, 
p<0.001), liver (11.11 ± 0.29g in JP17 rats versus 14.40 ± 0.32g in non-transgenic rats, 
p<0.001) and spleen (0.66 ± 0.02g in JP17 rats versus 0.84 ± 0.03g in non-transgenic 
rats, p<0.001) do differ (n=6 in all groups). Disproportionate growth is well 
known in transgenic animals expressing hGH (e.g. Shea et al., 1987). 



Body weight measurements in ageing JP17 rats. 

After about 140 days, the group of JP17 transgenic male rats gain weight more 
rapidly than their non-transgenic littermates (Δ weight between 200 and 420 days 356.5 ± 
57.419g for JP17 males versus 182.50 ± 7.554g for non-transgenic males, p<0.03, n=5; 
Figure 5.1). Female JP17 transgenic rats show only a slight increase in weight gain when 
compared to non-transgenic littermates (Δ weight between 280 and 480 days 111.8 ± 8.2g 
for JP17 females versus 88 ± 5.1g for non-transgenic females, p<0.04, n=6). This is 
illustrated in Figure 8, which clearly shows the sexually dimorphic weight gains in these 
animals. No significantly increased weight gain is observed in either sex of transgenic JP59 
rats compared with non-transgenic JP59 rats. At one year, the weights of the kidneys and 
liver of JP17 male rats have reached a value that is not significantly different from that of 
the non-transgenic rats (n=6 in both groups) (p<0.08 for kidneys and livers), but the 
spleens remain lighter (1.03 ± 0.04g versus 1.225 ± 0.05g in wild-type rats, p<0.01). 
These organs in transgenic JP59 rats show no variation from their non-transgenic 
littermate controls (p>0.43). 

Body length, width and fat-pad measurements. 

Measurements of the body lengths (nose-anus) and the width across the pelvic area 
of anaesthetised male JP17 and non-transgenic rats are taken (Figure 9). At 20 weeks of 
age male JP17 transgenic rats are shorter than their littermate non-transgenic controls, 
with an increased width across the pelvic area. At 52 weeks, the difference in nose-anus 
length is no longer significant but the girth of the transgenic JP17 rats has increased 
greatly whereas the non-transgenic rats only exhibit a moderate increase in girth. This 
late-onset increase in the body weight/length ratio is shown in Figure 10. 

A comparison of the body proportions of live SLOB rats and non-transgenic littermates 
shows a marked increase in abdominal fat. This is also obvious when individual 
peri-renal fat pads are compared. To evaluate this abdominal distribution of extra fat in SLOB 
rats, peri-renal and testicular fat pads are dissected and weighed from matched groups of 
male JP17 and non-transgenic littermates of 77, 140 and 365 days of age, and the results 
are shown in Figure 11. The peri-renal fat pads of the transgenic rats are markedly 
increased in weight at both 140 days and 365 days, when their mean weight is almost five 
times that of the non-transgenic animals. The testicular fat, however, does not show a 
comparable increase. Although testicular fat pad weights are marginally larger than those 
of the non-transgenic rats at 140 days (p<0.05), no further significant increase occurs 
during the period of a large accretion in peri-renal fat, and there is no difference in 
testicular fat pad weights at 365 days between SLOB rats and their non-transgenic 
littermates, despite their much larger body weight and evident gross visceral obesity. 

In other matched groups of 1 year old SLOB rats and non-transgenic littermates of both 
sexes, plasma cholesterol, triglycerides, glucose, insulin, leptin and corticosterone are 
measured in blood samples taken when the animals are killed, and the results are 
summarised in Figure 12. Plasma triglycerides are modestly but significantly elevated in 
SLOB males compared to non-transgenic males. There are no differences in plasma 
triglycerides in females. Cholesterol levels are no different between the groups. Plasma 
glucose and insulin values are also in the normal range and do not differ between the 
groups, suggesting that the obesity is not secondary to diabetes or insulin resistance. 
Plasma corticosterone is also in the normal range in all groups of rats. Notably however, 
the plasma leptin levels are elevated significantly in both male and female SLOB rats 
compared with their non-transgenic littermates, and are almost two-fold higher in SLOB 
males than in SLOB females. Leptin receptor transcript isoforms are also expressed in 
normal amounts in the hypothalamus, piriform cortex and choroid plexus. These increases 
in leptin would be expected given the increased body fat of SLOB rats, but prompted a 
study of their food intake. 

A further group of five 11-month-old SLOB rats and five non-transgenic rats are housed singly 
in metabolic cages for 14 days, and after a period of acclimatisation to single housing, 
food intake is measured over the last four days of the experiment. There is no significant 
difference in food consumption between the two groups (SLOB rats 23.4 ± 1.1g/day vs. 
23.5 ± 1.8g/day in the non-transgenic males; mean ± S.D.). 
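
The food intake comparison can be reproduced approximately from the summary figures alone. The statistical test actually used is not stated in the text, so the following Python sketch assumes Welch's unequal-variance t-test purely for illustration, using the reported means, standard deviations and group sizes (n=5 per group).

```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t statistic and approximate degrees of freedom from summary data."""
    v1 = sd1 ** 2 / n1  # squared standard error of group 1
    v2 = sd2 ** 2 / n2  # squared standard error of group 2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Food intake in g/day (mean, S.D., n) as reported above.
t, df = welch_t(23.4, 1.1, 5, 23.5, 1.8, 5)
print(f"t = {t:.3f}, df = {df:.1f}")
```

The magnitude of t obtained this way is far below the two-sided 5% critical value for about six to seven degrees of freedom (roughly 2.4), which is consistent with the absence of a significant difference in food consumption.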

Although the SLOB phenotype, as demonstrated by the foregoing, has a striking late-onset 
feature, the phenotype is latent at a younger age and can be induced by increasing the 
levels of fat in the diet. This is demonstrated by observing the phenotypic differences 
resulting from feeding two groups of 100 day old transgenic and normal littermates either 
regular rat chow, which has a fat content of 4%, or a high fat diet having a fat content of 
30%, over a 27 day period. 

The rats fed a normal diet show no significant difference in weight gain between 
transgenic and non-transgenic littermates. However, in the case of the rats fed on a 30% 
fat diet, the transgenic animals gain twice the weight of their non-transgenic littermates 
(see Figure 13). Controls in dwarf rats show that the obese phenotype is not due to 
growth hormone deficiency. 

Plasma leptin levels are measured at sacrifice. These are found to be higher in transgenic 
animals, and rise in both transgenic and non-transgenic rats fed on a 30% fat diet. 
Moreover, the increase in dietary fat is associated with a significantly reduced food intake 
in normal rats, but not in SLOB rats, despite their higher leptin levels. 

Induction of obesity in ovariectomised female rats. 

Four groups of female rats are studied (see Figure 14). Sham-operated transgenic 
female SLOB rats are lighter than non-transgenic sham-operated female littermates at 100 
days, but gain the same amount of weight in the following 11 week period (Δwt 45.5 ± 
5.3g vs Δwt 48.4 ± 3.8g). In rats ovariectomised under anaesthesia, both groups show an 
increase in weight gain; this increase is much higher in SLOB rats than in non-transgenic 
littermates (Δwt 128 ± 7.7g vs 89 ± 4.3g, P<0.001). Some animals from each group are 
killed 18 weeks post ovariectomy and their supra-renal fat pads dissected and weighed. 
The fat pad weight is much larger after ovariectomy in SLOB rats (4.67 ± 0.61 vs 1.37 ± 
0.39g in ovariectomised versus sham-ovariectomised SLOB females, P<0.01) than in 
non-transgenic rats (2.33 ± 0.94g vs 1.0 ± 0.01g in ovariectomised vs 
sham-ovariectomised non-transgenic littermates). 

Fertility of the JP17 male rats. 

Twelve JP17 and twelve non-transgenic young adult males are each housed with 
two 12-week old normal females for several consecutive days. The female rats are 
examined every morning for evidence of copulation, either in the form of a vaginal plug 
or sperm in vaginal smears, and are observed for a sufficient amount of time to allow any 
litters conceived during this time to be born. No litters are sired in this time by JP17 
males, whereas 11 of the 12 females housed with wild-type males produce litters. 

The immediate cause of infertility in male SLOB rats is unknown. The size and gross 
anatomy of their testes and seminal vesicles are normal, suggesting unaffected levels of 
gonadotrophins or androgens. Testicular size, sperm morphology, motility and 
testosterone levels all appear normal in SLOB rats. Treatment of a SLOB male rat with 
exogenous androgens does not improve fertility. One cause could be hGH, since infertility 
is a common problem in GH transgenic animals (Y\m et al, 1987, Bartke et al, 1988, 
Flavell et al, 1996) and male transgenic animals expressing hGH have been reported to 
have a reduced frequency in the impregnation of females (Bartke et al, 1992). However, 
female SLOB rats also express equivalent amounts of hGH and are not infertile. 
Furthermore, we found no evidence for hGH expression or of hGH protein in testes from 
SLOB rats. 

In contrast, the expression of 5'OT-EST in normal rats and the high level of a truncated 
RNA product from 5'OT-EST-xdel in hypothalamus and, in particular, the testis from 
SLOB male rats, and the lack of expression of either product in ovaries in SLOB females, 
lead us to conclude that the novel infertility and obesity phenotype more probably results 
from the presence of multiple copies of 5'OT-EST-xdel in SLOB rats. A disruption of 
testicular function by 5'OT-EST-xdel and consequent infertility is part of, may partly 
contribute to, or may exacerbate the degree of the male-preponderance of, the obesity 
phenotype of SLOB rats. A testicular disruption is not absolutely required however, since 
a mild visceral obesity can also be discerned in SLOB females. 

Longevity of SLOB male rats. 

The longevity of JP17 rats also appears to differ from that of normal rats. Six male JP17 
rats and six wild-type rats are housed under constant conditions. After two years, all six 
JP17 rats have died, five at between 10 and 14 months of age and the sixth at 21 months 
of age, whereas only a single wild-type rat has died, at 13 months. The longevity of JP17 
females or JP59 males or females has not been similarly investigated. 

Comparison of phenotype with other rat obesity models 

When comparing the phenotype in SLOB rats with other findings reported in the 
literature, the closest parallels are lines of transgenic rats expressing hGH driven by a 
mouse whey acidic protein promoter (Ninomiya et al, 1994; Ikeda et al, 1994, 1995, 
1997). Ikeda et al (1995) described two lines of rats expressing high or low hGH levels 
in serum. Gigantism is observed in the high hGH-expressing line, but visceral obesity is 
also observed in the low-expressing line, associated with endogenous GH suppression. 
No sexual dimorphism is reported, and the obesity is associated with carbohydrate 
metabolic disorders, hypertriglyceridaemia and insulin resistance. Ikeda et al (1995) 
specifically concluded that the effect is due to differences in serum hGH levels affecting 
carbohydrate metabolism. A later study in these rats (Ikeda et al, 1997) reported female 
infertility and enlarged ovaries, which further distinguishes this phenotype from that seen 
in SLOB rats. 

In common with the rats reported by Ikeda et al (1994), SLOB rats also show reduced rat 
GH production and secretion. GH deficiency is associated with increased visceral fat in 
humans, but this can be alleviated by hGH treatment. However, isolated rat GH 
deficiency is an unlikely cause of obesity in SLOB rats, as other lines of severely GH 
deficient dwarf rats (Charlton et al, 1988) do not develop obesity when housed under 
identical conditions to SLOB rats. Obesity can be induced in such dwarf rats (as in 
normal rats) when placed on high fat diets for prolonged periods, though females are more 
susceptible than males (Clark et al, 1996). A similar pituitary GH suppression is also 
evident in female SLOB rats but they do not develop the same massive abdominal obesity 
as males. Pituitary rat GH suppression is also seen in the non-obese JP59 rats of both 
sexes and in Tgr rats (Flavell et al, 1996), which do not develop obesity. 

The defects in other genetic models of obesity in the rat have recently been clarified; examples 
of these include the Zucker fa/fa rat, the Koletsky (f) obese rat, the JLA/cp corpulent rat, 
and the OLETF rat, and their related substrains (Iida et al, 1996; Wu-Peng et al, 1997; 
Takaya et al, 1996; Lee et al, 1991; Kahle et al, 1997). None of these show the male 
specificity, late onset or pattern of distribution of obesity seen in SLOB rats and they 
exhibit significant hyperglycaemia and insulin resistance, which again distinguishes them 
from SLOB rats. Male specificity, infertility, extremely late onset of obesity, a highly 
selective visceral accumulation of fat, but relatively normal metabolic profile, without 
insulin resistance, hyperphagia or hyperglycaemia, distinguish the dominant phenotype 
in the SLOB rats from all other known models of obesity in the rat, including those with 
low endogenous rat GH expression or hGH expression from other transgenes. 

1. 5'OT-EST polypeptide having a sequence selected from the group comprising 
the sequences set forth in any one of SEQ. ID. Nos. 2, 4 or 6, and sequences substantially 
homologous to any one of the polypeptides set forth in SEQ. ID. Nos. 2, 4 or 6. 

2. A polypeptide according to claim 1 comprising an amino acid sequence encoded by 
at least one exon selected from the group consisting of exons x, y and z as set forth in 
SEQ. ID. No. 16, or equivalents thereof as set forth in any one of SEQ. ID. Nos. 3 or 5. 

3. A polypeptide according to claim 2, which comprises an amino acid sequence 
encoded by at least part of exon w as set forth in SEQ. ID. No. 16, or equivalents thereof as 
set forth in any one of SEQ. ID. Nos. 3 or 5. 

4. A mutant of a 5'OT-EST polypeptide according to any preceding claim which is 
capable, in vivo, of modulating the obesity of an animal expressing it. 

5. A mutant according to claim 4, wherein the animal is a transgenic animal expressing 
the mutant as a result of transformation with a transgene. 

6. A mutant according to claim 4 or claim 5, which comprises the sequence 
PRPRSFSAPFSSQDS, or a sequence substantially homologous thereto. 

7. A mutant according to any one of claims 4 to 6 which comprises the sequence 
MLRALNRLAARPGGQPPTLLLLPVRGPRPRSFSAPFSSQDS, or a sequence substantially 
homologous thereto. 

8. A nucleic acid encoding a 5'OT-EST polypeptide or mutant 5'OT-EST 
polypeptide according to any preceding claim. 

9. A nucleic acid according to claim 8, having a sequence selected from the group 
consisting of any one of SEQ. ID. Nos. 1, 3, 5, 7, 16 or 17; sequences which are 
hybridisable under stringent conditions with an oligonucleotide comprising 20 contiguous 
bases from any one of SEQ. ID. Nos. 1, 3, 5, 7, 16 or 17; sequences substantially 
homologous to any one of SEQ. ID. Nos. 1, 3, 5, 7, 16 or 17; and sequences complementary 
thereto. 

10. A nucleic acid according to claim 9, comprising the sequence 
ATGTTGCGGGCTTTGAACCGCCTGGCCGCGCGGCCCGGGGGCCAGCCCCCAACCCTGCTCCTTCTGCCCGTGC 
GCGGCCCACGGCCCCGCTCATTCTCGGCTCCTTTTTCCTCGCAGGATAGC, or an 
equivalent sequence which encodes the same polypeptide having regard to the degeneracy 
of the nucleic acid code, or a sequence substantially homologous thereto. 

11. A nucleic acid vector comprising a nucleic acid sequence according to any one 
of claims 8 to 11. 

12. A vector according to claim 11 which is a cosmid vector. 

13. A vector according to claim 11 or claim 12 further comprising the sequences of 
the oxytocin (OT) gene, the vasopressin (AVP) gene and/or the human growth hormone 
(hGH) gene. 

14. A vector according to claim 12 having the structure of cV014 as set forth in 
Figure 4 (SEQ. ID. No. 17). 

15. A cell transformed with a vector according to any one of claims 11 to 13. 

16. A method for producing a 5'OT-EST polypeptide or a mutant 5'OT-EST 
polypeptide according to any one of claims 1 to 7, comprising transforming a cell with a 
vector according to any one of claims 11 to 13 and culturing the cell to produce the 
polypeptide. 

17. A transgenic non-human animal expressing, as a result of transgene expression, a 
5'OT-EST polypeptide or mutant 5'OT-EST polypeptide according to any one of claims 1 
to 7. 

18. A transgenic animal according to claim 17, which has been transformed with a 
vector according to any one of claims 12 to 14. 

19. A transgenic animal according to claim 17 or claim 18, comprising more than one 
copy of the transgene. 

20. A transgenic animal according to any one of claims 17 to 19, which is a mammal. 

21. A transgenic animal according to claim 20 which is a rat. 

22. A transgenic rat comprising at least four concatameric copies of a transgene 
having the structure of cV014 as set forth in Figure 4 (SEQ. ID. No. 17). 

23. A non-human mammal possessing the following obese phenotype: (i) a very late 
onset of obesity, (ii) a highly selective visceral distribution of fat developing on a normal 
rodent diet, without hyperphagia, (iii) an effect greatly preponderant in males, (iv) a 
predisposition to excessive dietary-fat induced obesity at an early age, before the 
phenotype becomes apparent on a normal diet, and (v) a dominant pattern of inheritance; 
the non-human mammal being obtainable by transformation with a vector according to 
any one of claims 11 to 14. 

24. Use of an animal according to any one of claims 17 to 23 as a model for human 
late onset obesity, human dietary-fat associated juvenile obesity, human female post- 
menopausal obesity and/or human male infertility. 

25. A method for identifying a compound or compounds capable of modulating 
obesity and/or infertility in a mammal, comprising the steps of: 

a) exposing an animal according to any one of claims 17 to 24 to the compound or 
compounds to be tested; 

b) determining the effect of the compound on the obesity and/or infertility 
phenotype; and 

c) selecting the compound or compounds which are capable of modulating the 
obesity and/or infertility phenotype in the desired manner. 

26. A method for producing a compound or compounds capable of modulating obesity 
and/or infertility in a mammal, comprising the steps of: 

a) exposing an animal according to any one of claims 17 to 24 to the compound or 
compounds to be tested; 

b) determining the effect of the compound on the obesity and/or infertility 
phenotype; 

c) selecting the compound or compounds which are capable of modulating the 
obesity and/or infertility phenotype in the desired manner; and 

d) producing the compound or compounds by conventional isolation or synthesis 
techniques. 

27. A method for identifying a candidate compound capable of influencing lipid 
transport, comprising the steps of: 

a) contacting 5'OT-EST polypeptide with a candidate compound or compounds 
and determining which candidate compound or compounds is capable of interacting 
with 5'OT-EST; 

b) optionally, testing candidate compounds which interact with 5'OT-EST in a 
method according to claim 25. 

28. A diagnostic reagent for the detection of mutations, polymorphisms or other 
changes in 5'OT-EST which may predispose an individual to obesity. 

29. Use of a tissue derived from a transgenic animal according to any one of claims 17 
to 24 in a screen to identify a genetic cause of obesity, comprising the steps of: 

a) isolating one or more gene products from tissue derived from a transgenic animal 
according to any one of claims 17 to 24; and 

b) determining whether the expression of a gene product is correlated with obesity. 

Abstract 

The invention describes a previously unknown gene, termed 5'OT-EST, which is 
responsible for inducing an obesity and/or infertility phenotype in transgenic animals, and 
transgenic animals comprising mutants of 5'OT-EST which are useful for assaying 
compounds for the treatment of obesity and/or infertility. 



PCT/GB99/02658 



SEQUENCE LISTING 



(1) GENERAL INFORMATION: 

(i) APPLICANT: 

(A) NAME: MEDICAL RESEARCH COUNCIL 

(B) STREET: 20 PARK CRESCENT 

(C) CITY: LONDON 

(E) COUNTRY: UK 

(F) POSTAL CODE (ZIP): W1N 4AL 
(ii) TITLE OF INVENTION: GENE 

(iii) NUMBER OF SEQUENCES: 16 

(iv) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30 (EPO) 



(2) INFORMATION FOR SEQ ID NO : 1: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 924 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(ix) FEATURE : 

(A) NAME/KEY: CDS 

(B) LOCATION: 5..604 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 

TGTC ATG TTG CGG GCT TTG AAC CGC CTG GCC GCG CGG CCC GGG GGC CAG 49 
Met Leu Arg Ala Leu Asn Arg Leu Ala Ala Arg Pro Gly Gly Gln 
1 5 10 15 

CCC CCA ACC CTG CTC CTT CTG CCC GTG CGC GGC CGC AAG ACC CGC CAC 97 
Pro Pro Thr Leu Leu Leu Leu Pro Val Arg Gly Arg Lys Thr Arg His 
20 25 30 

GAT CCG CCT GCC AAG TCC AAG GTC GGG CGC GTG AAA ATG CCT CCT GCA 145 
Asp Pro Pro Ala Lys Ser Lys Val Gly Arg Val Lys Met Pro Pro Ala 
35 40 45 

GTG GAC CCT GCG GAA TTG TTC GTG TTG ACC GAG CGC TAC CGA CAG TAC 193 
Val Asp Pro Ala Glu Leu Phe Val Leu Thr Glu Arg Tyr Arg Gln Tyr 
50 55 60 

CGG GAG ACG GTG CGC GCT CTC AGG CGA GAG TTC ACA TTG GAG GTG CGA 241 
Arg Glu Thr Val Arg Ala Leu Arg Arg Glu Phe Thr Leu Glu Val Arg 
65 70 75 

GGG AAA TTG CAC GAG GCC CGA GCC GGG GTT CTG GCT GAG CGC AAG GCG 289 
Gly Lys Leu His Glu Ala Arg Ala Gly Val Leu Ala Glu Arg Lys Ala 
80 85 90 95 

CAA GAG GCC ATC AGA GAG CAC CAG GAG CTG ATG GCC TGG AAC CGG GAG 337 
Gln Glu Ala Ile Arg Glu His Gln Glu Leu Met Ala Trp Asn Arg Glu 
100 105 110 

GAG AAC CGG AGA CTG CAG GAA CTA CGG ATA GCT AGG TTG CAG CTC GAA 385 
Glu Asn Arg Arg Leu Gln Glu Leu Arg Ile Ala Arg Leu Gln Leu Glu 
115 120 125 

GCA CAG GCC CAG GAG CTG CGG CAG GCT GAG GTC CAG GCC CAG AGG GCC 433 
Ala Gln Ala Gln Glu Leu Arg Gln Ala Glu Val Gln Ala Gln Arg Ala 
130 135 140 

CAG GAG GAG CAG GCT TGG GTG CAA CTG AAA GAA CAA GAA GTT CTC AAA 481 
Gln Glu Glu Gln Ala Trp Val Gln Leu Lys Glu Gln Glu Val Leu Lys 
145 150 155 

CTG CAG GAG GAG GCC AAA AAC TTC ATC ACT CGG GAG AAC CTG GAG GCA 529 
Leu Gln Glu Glu Ala Lys Asn Phe Ile Thr Arg Glu Asn Leu Glu Ala 
160 165 170 175 

CGG ATA GAA GAG GCC TTG GAC TCT CCG AAG AGT TAT AAC TGG GCG GTC 577 
Arg Ile Glu Glu Ala Leu Asp Ser Pro Lys Ser Tyr Asn Trp Ala Val 
180 185 190 

ACC AAA GAA GGG CAG GTG GTC AGG AAC TGAGAACAGA GGCCTCTCAG 624 
Thr Lys Glu Gly Gln Val Val Arg Asn 
195 200 

GCCCAAATAA GGACAGTGCT TGCCTAGGGA CTGGATATTG GGGTAGAAAT TGGTGCATCC 684 

CAGGAGGGTG GCACAGCCTT GTCCAGAGCA GCCCCCATTC ATTCTAGATT TGGCACCAGG 744 

TATAGTACCT GTTCTGACAC CACATACAAA CTCCGGACAG CATTAAACTC TGGGAAGTTC 804 

CTATCACACA GAAGATCAGA CTGGACTGTC CCCTCTAGAA GCCAAGAGCT GTCTCCTGAG 864 

TTTCTTGGAA TAGTGTGAGC CCAATGTTTC CTGCTTTTAT AAATAAACTA TTGGAAAGCA 924 



(2) INFORMATION FOR SEQ ID NO : 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 200 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY : linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 2: 



Met Leu Arg Ala Leu Asn Arg Leu Ala Ala Arg Pro Gly Gly Gln Pro 
1 5 10 15 

Pro Thr Leu Leu Leu Leu Pro Val Arg Gly Arg Lys Thr Arg His Asp 
20 25 30 

Pro Pro Ala Lys Ser Lys Val Gly Arg Val Lys Met Pro Pro Ala Val 
35 40 45 

Asp Pro Ala Glu Leu Phe Val Leu Thr Glu Arg Tyr Arg Gln Tyr Arg 
50 55 60 

Glu Thr Val Arg Ala Leu Arg Arg Glu Phe Thr Leu Glu Val Arg Gly 
65 70 75 80 

Lys Leu His Glu Ala Arg Ala Gly Val Leu Ala Glu Arg Lys Ala Gln 
85 90 95 

Glu Ala Ile Arg Glu His Gln Glu Leu Met Ala Trp Asn Arg Glu Glu 
100 105 110 

Asn Arg Arg Leu Gln Glu Leu Arg Ile Ala Arg Leu Gln Leu Glu Ala 
115 120 125 

Gln Ala Gln Glu Leu Arg Gln Ala Glu Val Gln Ala Gln Arg Ala Gln 
130 135 140 

Glu Glu Gln Ala Trp Val Gln Leu Lys Glu Gln Glu Val Leu Lys Leu 
145 150 155 160 

Gln Glu Glu Ala Lys Asn Phe Ile Thr Arg Glu Asn Leu Glu Ala Arg 
165 170 175 

Ile Glu Glu Ala Leu Asp Ser Pro Lys Ser Tyr Asn Trp Ala Val Thr 
180 185 190 

Lys Glu Gly Gln Val Val Arg Asn 
195 200 

(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 998 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY : linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL : NO 
(iv) ANTI-SENSE : NO 



(ix) FEATURE : 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..615 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

ATG CTA CGC GCG CTG AGC CGC CTG GGC GCG GGG ACC CCG TGC AGG CCC 48 
Met Leu Arg Ala Leu Ser Arg Leu Gly Ala Gly Thr Pro Cys Arg Pro 
205 210 215 

CGG GCC CCT CTG GTG CTG CCA GCG CGC GGC CGC AAG ACC CGC CAC GAC 96 
Arg Ala Pro Leu Val Leu Pro Ala Arg Gly Arg Lys Thr Arg His Asp 
220 225 230 

CCG CTG GCC AAA TCC AAG ATC GAG CGA GTG AAC ATG CCG CCC GCG GTG 144 
Pro Leu Ala Lys Ser Lys Ile Glu Arg Val Asn Met Pro Pro Ala Val 
235 240 245 

GAC CCT GCG GAG TTC TTC GTG CTG ATG GAG CGT TAC CAG CAC TAC CGC 192 
Asp Pro Ala Glu Phe Phe Val Leu Met Glu Arg Tyr Gln His Tyr Arg 
250 255 260 

CAG ACC GTG CGC GCC CTC AGG ATG GAG TTC GTG TCC GAG GTG CAG AGG 240 
Gln Thr Val Arg Ala Leu Arg Met Glu Phe Val Ser Glu Val Gln Arg 
265 270 275 280 

AAG GTG CAC GAG GCC CGA GCC GGG GTT CTG GCG GAG CGC AAG GCC CTG 288 
Lys Val His Glu Ala Arg Ala Gly Val Leu Ala Glu Arg Lys Ala Leu 
285 290 295 

AAG GAC GCC GCC GAG CAC CGC GAG CTG ATG GCC TGG AAC CAG GCG GAG 336 
Lys Asp Ala Ala Glu His Arg Glu Leu Met Ala Trp Asn Gln Ala Glu 
300 305 310 

AAC CGG CGG CTG CAC GAG CTG CGG ATA GCG AGG CTG CGG CAG GAG GAG 384 
Asn Arg Arg Leu His Glu Leu Arg Ile Ala Arg Leu Arg Gln Glu Glu 
315 320 325 

CGG GAG CAG GAG CAG CGG CAG GCG TTG GAG CAG GCC CGC AAG GCC GAA 432 
Arg Glu Gln Glu Gln Arg Gln Ala Leu Glu Gln Ala Arg Lys Ala Glu 
330 335 340 

GAG GTG CAG GCC TGG GCG CAG CGC AAG GAG CGG GAA GTG CTG CAG CTG 480 
Glu Val Gln Ala Trp Ala Gln Arg Lys Glu Arg Glu Val Leu Gln Leu 
345 350 355 360 

CAG GAA GAG GTG AAA AAC TTC ATC ACC CGA GAG AAC CTG GAG GCA CGG 528 
Gln Glu Glu Val Lys Asn Phe Ile Thr Arg Glu Asn Leu Glu Ala Arg 
365 370 375 

GTG GAA GCA GCA TTG GAC TCC CGG AAG AAC TAC AAC TGG GCC ATC ACC 576 
Val Glu Ala Ala Leu Asp Ser Arg Lys Asn Tyr Asn Trp Ala Ile Thr 
380 385 390 

AGA GAG GGG CTG GTG GTC AGG CCA CAA CGC AGG GAC TCC TAGGGGCCCA 625 
Arg Glu Gly Leu Val Val Arg Pro Gln Arg Arg Asp Ser 
395 400 405 

GTAAGGACAG TGCCCGCCAG GGACCATGTA TGTATCATGG CGGAAGAGTT GGCCCTGACC 685 

TGGAATAAAG CAGTTGGTGT TGCTTATGAG GAAGGTTCAG CCTTATCCAG CACAGCCTTC 745 

ACGTTTTGCC CTCTGCTGTC ACCACTTGGT CAGAAACTTC CAAACGCAGT GCCCTGTTCT 805 

GCCGGTGTGT AAAGCCTCAG CGCACCAGGA GACCCTAGAG TGGTTTCCAT CTCACAGAGA 865 

ATCAGACAGG CCACAGCCCC CTCAGGCAGC CAGGTCATCT GAGTATCATT AAGAGTAGTG 925 

ATGGGAAGAT TACAGTCTGA GGGCCAAACG TGCCTGCTTC CTGTTTTTGT AAATAAAGTT 985 

TTGTTGGAAC ACA 998 



(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 205 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Met Leu Arg Ala Leu Ser Arg Leu Gly Ala Gly Thr Pro Cys Arg Pro
1 5 10 15

Arg Ala Pro Leu Val Leu Pro Ala Arg Gly Arg Lys Thr Arg His Asp
20 25 30

Pro Leu Ala Lys Ser Lys Ile Glu Arg Val Asn Met Pro Pro Ala Val
35 40 45

Asp Pro Ala Glu Phe Phe Val Leu Met Glu Arg Tyr Gln His Tyr Arg
50 55 60

Gln Thr Val Arg Ala Leu Arg Met Glu Phe Val Ser Glu Val Gln Arg
65 70 75 80

Lys Val His Glu Ala Arg Ala Gly Val Leu Ala Glu Arg Lys Ala Leu
85 90 95

Lys Asp Ala Ala Glu His Arg Glu Leu Met Ala Trp Asn Gln Ala Glu
100 105 110

Asn Arg Arg Leu His Glu Leu Arg Ile Ala Arg Leu Arg Gln Glu Glu
115 120 125

Arg Glu Gln Glu Gln Arg Gln Ala Leu Glu Gln Ala Arg Lys Ala Glu
130 135 140

Glu Val Gln Ala Trp Ala Gln Arg Lys Glu Arg Glu Val Leu Gln Leu
145 150 155 160

Gln Glu Glu Val Lys Asn Phe Ile Thr Arg Glu Asn Leu Glu Ala Arg
165 170 175

Val Glu Ala Ala Leu Asp Ser Arg Lys Asn Tyr Asn Trp Ala Ile Thr
180 185 190

Arg Glu Gly Leu Val Val Arg Pro Gln Arg Arg Asp Ser
195 200 205

(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 943 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 5..604

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:



TGTC ATG TTG CGC GCT CTG AAC CGC CTG GCG CAG CGG CCG GGA GAC CGG 49
Met Leu Arg Ala Leu Asn Arg Leu Ala Gln Arg Pro Gly Asp Arg
210 215 220

CCC CCG ACC CCG CTG CTC CTG CCC GTG CGC GGC CGC AAG ACC CGC CAT 97
Pro Pro Thr Pro Leu Leu Leu Pro Val Arg Gly Arg Lys Thr Arg His
225 230 235

GAC CCG CCT GCC AAA TCC AAG GTC GGA CGG GTG CAG ACG CCT CCC GCC 145
Asp Pro Pro Ala Lys Ser Lys Val Gly Arg Val Gln Thr Pro Pro Ala
240 245 250

GTG GAC CCT GCG GAA TTC TTC GTG TTG ACC GAG CGC TAC GGA CAG TAC 193
Val Asp Pro Ala Glu Phe Phe Val Leu Thr Glu Arg Tyr Gly Gln Tyr
255 260 265

CGG GAG ACC GTG CGC GCT CTC AGG CTA GAG TTC ACG TTG GAT GTG CGA 241
Arg Glu Thr Val Arg Ala Leu Arg Leu Glu Phe Thr Leu Asp Val Arg
270 275 280

AGG AAA TTG CAC GAG GCC CGA GCC GGG GTT CTG GCC GAG CGC AAG GCG 289
Arg Lys Leu His Glu Ala Arg Ala Gly Val Leu Ala Glu Arg Lys Ala
285 290 295 300

CAG CAG GCC ATC ACG GAG CAC CGG GAG CTG ATG GCC TGG AAC CGG GAC 337
Gln Gln Ala Ile Thr Glu His Arg Glu Leu Met Ala Trp Asn Arg Asp
305 310 315

GAG AAC CGG CGA ATG CAG GAG CTA CGG ATA GCG AGG TTG CAG CTG GAA 385
Glu Asn Arg Arg Met Gln Glu Leu Arg Ile Ala Arg Leu Gln Leu Glu
320 325 330

GCA CAG GCC CAG GAG GTG CAG AAG GCT GAG GCC CAG CGC CAG AGG GCT 433
Ala Gln Ala Gln Glu Val Gln Lys Ala Glu Ala Gln Arg Gln Arg Ala
335 340 345

CAG GAG GAG CAG GCT TGG GTG CAA CTG AAA GAG CAA GAA GTG CTC AAG 481
Gln Glu Glu Gln Ala Trp Val Gln Leu Lys Glu Gln Glu Val Leu Lys
350 355 360

CTG CAG GAG GAG GCA AAA AAC TTC ATC ACT CGG GAG AAC CTG GAG GCA 529
Leu Gln Glu Glu Ala Lys Asn Phe Ile Thr Arg Glu Asn Leu Glu Ala
365 370 375 380

CGG ATA GAA GAA GCG TTG GAC TCT CCG AAG AGT TAC AAC TGG GCC GTC 577
Arg Ile Glu Glu Ala Leu Asp Ser Pro Lys Ser Tyr Asn Trp Ala Val
385 390 395

ACC AAA GAA GGG CAG GTG GTC AGG AAC TGAGCACAGA GACTTCTGGG 624
Thr Lys Glu Gly Gln Val Val Arg Asn
400 405

GGCCCAAATA AGCACAGTGC TTGCCTAGGG TCTGTGTACT GGGATAGGAA TTGGTACATC 684
CCAGGAGGAT GGCTCAGCCG TTTCCAGAGC AACCTCAGTC ACTCCAGGCT CGGCACTCAC 744
CACCTGACTG GGAACTCCCA GATGTCCCTG TTCTGGCACC ACAGTCAAAC TGAGGGCAGC 804
ATTAAACTCT GGGAAGTTCC TATCGCACAG AGGATCGGAC TGGACTGTGT CCCTCTAGAA 864
GCCAAGCTTG TCTTGTAAGT CTCTTGGAGT CCTGTGAGCC AAATGTTTCC TGCTTTTATA 924
AATAAAGTAT TGGAGCCCA 943
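
As an illustrative cross-check only (it is not part of the listing as filed), the three-letter translation printed under a CDS can be recomputed from its nucleotide line. The sketch below assumes the standard genetic code and 1-based, inclusive feature coordinates of the kind given in a LOCATION entry; the function name `translate_cds` and the choice of the first row of SEQ ID NO: 5 as test input are assumptions made for this example.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order (assumed).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def translate_cds(seq, start, end=None):
    """Translate seq[start..end] (1-based, inclusive) into three-letter residues.

    Whitespace in the input is ignored; translation stops at the first stop codon.
    """
    seq = "".join(seq.split()).upper()
    end = len(seq) if end is None else end
    cds = seq[start - 1:end]
    residues = []
    for pos in range(0, len(cds) - len(cds) % 3, 3):
        aa = CODON_TABLE[cds[pos:pos + 3]]
        if aa == "*":
            break
        residues.append(THREE_LETTER[aa])
    return residues


# First nucleotide row of SEQ ID NO: 5; the CDS feature starts at position 5.
first_row = "TGTC ATG TTG CGC GCT CTG AAC CGC CTG GCG CAG CGG CCG GGA GAC CGG"
print(" ".join(translate_cds(first_row, start=5)))
```

Comparing the printed residues with the translation line under each row is a quick way to spot transcription errors in either line.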

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 200 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

Met Leu Arg Ala Leu Asn Arg Leu Ala Gln Arg Pro Gly Asp Arg Pro
1 5 10 15

Pro Thr Pro Leu Leu Leu Pro Val Arg Gly Arg Lys Thr Arg His Asp
20 25 30

Pro Pro Ala Lys Ser Lys Val Gly Arg Val Gln Thr Pro Pro Ala Val
35 40 45

Asp Pro Ala Glu Phe Phe Val Leu Thr Glu Arg Tyr Gly Gln Tyr Arg
50 55 60

Glu Thr Val Arg Ala Leu Arg Leu Glu Phe Thr Leu Asp Val Arg Arg
65 70 75 80

Lys Leu His Glu Ala Arg Ala Gly Val Leu Ala Glu Arg Lys Ala Gln
85 90 95

Gln Ala Ile Thr Glu His Arg Glu Leu Met Ala Trp Asn Arg Asp Glu
100 105 110

Asn Arg Arg Met Gln Glu Leu Arg Ile Ala Arg Leu Gln Leu Glu Ala
115 120 125

Gln Ala Gln Glu Val Gln Lys Ala Glu Ala Gln Arg Gln Arg Ala Gln
130 135 140

Glu Glu Gln Ala Trp Val Gln Leu Lys Glu Gln Glu Val Leu Lys Leu
145 150 155 160

Gln Glu Glu Ala Lys Asn Phe Ile Thr Arg Glu Asn Leu Glu Ala Arg
165 170 175

Ile Glu Glu Ala Leu Asp Ser Pro Lys Ser Tyr Asn Trp Ala Val Thr
180 185 190

Lys Glu Gly Gln Val Val Arg Asn
195 200

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2852 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(vii) IMMEDIATE SOURCE:
(B) CLONE: Rat 5'OT-EST-xdel

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 1026..1270

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 1799..2235

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1030..1152

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:
TGACCTCTGT GGATCTGATA TACATGTAAG TGACAGACCA TCCGAGCTAT ATAGTGAGAC 60
CTGTGCAAGG AAGGATGGAG TGCACGTTCC CTGATGTTCA GAGCAACCCT GTGTCACTCC 120
AGGTAGGTGA GATGAGAGGA AGAGGGTGGC CTTGGCCTGG GCCTCCTACG GGCCTGGAAG 180
TTGGGAGAAG GATGTAAGCA GACTCTGTTC TCTTCTGAGA AATATCAGGT ATTGCAGTCA 240
GCCCAGGCTC CTCAGACCCT CCTAAGTGCA GATTCTCTGC AGAATCTGGT GTTGACAACA 300
CTAATGAGTA GGATGAGACT TCAGTTCCCT AGCCCTCACC GTCAGCTTCT GATTACCAAC 360
AACTCTCCCA GAGGAGAGCC ATCTACCTTT GGGACAGATG CTCTCTGCCC TGCACTGCCT 420
CCTGTTTCTC TTCATTGTAG AGGAAGATAG TACTTTAAAA GCTTCATAAA TGGTCTCAAG 480
GTGGGAAGAC CCCGGCTCAG GTGAAAGAGG ACAAGCGTCA CCTCACACAG GCCACCCAGT 540
AGAAAACAAG TGATCACTGA TACTGAGAAC TCTGGCAATT GCAGAGCTGC CCAAGACCAC 600
AACAGGGCAG TGCAATGCAA GGAAAAGGTT TGTTGCTCGA TTGCAAACCT AAAGTTTAAA 660
GTGCATCAGG AGAACGCTTA CTCAAAGAGG AAGTGTAAGC CTAACTTAAG TAGCTAGAAG 720
CTCAGAATTT CTTGCATCAG CCCTGGAAGG GTACACAGGC CACCGGTGGG CCAGAGAACC 780
ACACGCTTTG GGGCGGTGTC CAAGCTTGTG AACAAGTAGG CAAGAGCGCC TGGTGTTGTA 840
GCTGTCATTG GCGGGCAATA CAGCCCAGCG AACTGTGGTC TCCAAGGTGC CCCTCGACCC 900
TCCCACTCTA CCCGAGACTC CAGGGACGCG ATGGGCCAGA CAGCAAGAGC TCCGCCTACG 960
GGGGCGGGGA CAGGAGATTC CCGTGATGCT CCTCGACCAC TTCCGGACAG GGCGCAGGCG 1020

CTAGCTGTC ATG TTG CGG GCT TTG AAC CGC CTG GCC GCG CGG CCC GGG 1068
Met Leu Arg Ala Leu Asn Arg Leu Ala Ala Arg Pro Gly
205 210

GGC CAG CCC CCA ACC CTG CTC CTT CTG CCC GTG CGC GGC CCA CGG CCC 1116
Gly Gln Pro Pro Thr Leu Leu Leu Leu Pro Val Arg Gly Pro Arg Pro
215 220 225

CGC TCA TTC TCG GCT CCT TTT TCC TCG CAG GAT AGC TAGGTTGCAG 1162
Arg Ser Phe Ser Ala Pro Phe Ser Ser Gln Asp Ser
230 235 240

CTCGAAGCAC AGGCCCAGGA GCTGCGGCAG GCTGAGGTCC AGGCCCAGAG GGCCCAGGAG 1222
GAGCAGGCTT GGGTGCAACT GAAAGAACAA GAAGTTCTCA AACTGCAGGT GGGCCGAGGT 1282
CGTGAGGAAT GTGGGTATTG GAGATTCCGG TGAGGGAGGC TCTGGGGAGA GCAGCACAGG 1342
GTGTCAAGTG ACCAGTCTTC AGGAGGCTTC TCTCTCTGCT CTGCACACAC AGAGTGCCTC 1402
CCAGACAATG GTCAATGAAA GGTTACAGGC TAGTATTGCC GTGTGAAACT TGAAGGTCAG 1462
GGAAACCATA AATGAGAATG GAGCTGTTTT TATTGTGTAA GGGAGAGTGA CAAGGTTGAG 1522
AGAGTCCACC ACCCCGCACC TCCCCCCGCC CCCAATCAGG TTGTCACGAT TCGATTCGTT 1582
CTTGGGTTGT GGCTGAGAGA TCTGATGGGT AATTGTCCGA GGAAGAGGGA TATAATGGTT 1642
GAGGTCACCT AGTACAGTTG TGCTGGCCTA TTGGTGGGAC ACTCAAAGGG GCCCTGGGCT 1702
CTTTTGACAC CCTTCTTAAG GTGGGCTAGA GACAGTAAGT TATGCAGGCA GCCAGCTCTG 1762
AGAGATCCCA CGTAGCTAAC CTTTCTCTTC CCGTAGGAGG AGGCCAAAAA CTTCATCACT 1822
CGGGAGAACC TGGAGGCACG GATAGAAGAG GCCTTGGACT CTCCGAAGAG TTATAACTGG 1882
GCGGTCACCA AAGAAGGGCA GGTGGTCAGG AACTGAGAAC AGAGGCCTCT CAGGCCCAAA 1942
TAAGGACAGT GCTTGCCTAG GGACTGGATA TTGGGGTAGA AATTGGTGCA TCCCAGGAGG 2002
GTGGCACAGC CTTGTCCAGA GCAGCCCCCA TTCATTCTAG ATTTGGCACC AGGTATAGTA 2062
CCTGTTCTGA CACCACATAC AAACTCCGGA CAGCATTAAA CTCTGGGAAG TTCCTATCAC 2122
ACAGAAGATC AGACTGGACT GTCCCCTCTA GAAGCCAAGA GCTGTCTCCT GAGTTTCTTG 2182
GAATAGTGTG AGCCCAATGT TTCCTGCTTT TATAAATAAA CTATTGGAAA GCAAAGCCTT 2242
TTGTTATGTG GCTTGCTTTT TCTTGTTGTA GAATAAGTTT ATTTGTCCCA GTTATTTGGG 2302
TCTTAAGGTT ATTAGCCAAA AGCCAGTTCA CCTAACTGAG CCAGGAGTTA GTTATCTGCT 2362
TTGCTCAATC CTGGGCTTTG CTGGGTAGGG TCAGGTGTGT CCAAGGTCCA GAAAGCAAAA 2422
AGGGTGCCCC GTTTCTCCTG GGAAGGCTTC CCCGTCAGTG ATTTCTGTAA CCGGACCCTG 2482
CCCTGACACA GCGTCATTGG ACTACCCAGC AGACAGTAGA CTCCACTCTA AACCCGCTTC 2542
TTGCGGTCAG TTGCTGTCCT TCAGTGTGTG TAAGCAGTGG CCAGACAGCA CCCTTGGGTG 2602
TCATTTCAAG ACTCTCTCAC CTTGGTCTGC TTTACGTTTG GTTTGATTTG GTTTGTTCTG 2662
GTTTTTGAGA CGAGGCCTTT CACTGGAACC TGGCACTCAG TATTTAGACT GCCCAGCCAG 2722
CTAGCCTCAG AGAATGCATC TGCGTATGCT TGCCTGGCGC TGGAATTCGG TGCACATGGC 2782
TTTGATGTGT ACCGGGGATC AGACACAGAT GTTTCATGAG TGCAGTGCAT GCCTGTTAGT 2842
GGTAGAGCTC 2852

(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 41 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

Met Leu Arg Ala Leu Asn Arg Leu Ala Ala Arg Pro Gly Gly Gln Pro
1 5 10 15

Pro Thr Leu Leu Leu Leu Pro Val Arg Gly Pro Arg Pro Arg Ser Phe
20 25 30

Ser Ala Pro Phe Ser Ser Gln Asp Ser
35 40

(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "Synthetic Primer"

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:
TTCACACCAC TCTGTCGAAC

(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "Synthetic Primer"

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:
AGGAGGAAGA CAGGTGAAAG

(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "Synthetic Primer"

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:
TCATGTTGCG GGCTTTGAAC

(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "Synthetic Primer"

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:
TCTTTCAGTT GCACCCAAGC 20
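
As an illustrative aside (not part of the listing as filed), a primer flagged ANTI-SENSE: YES is expected to match the sense strand only after reverse complementation. The sketch below assumes plain A/C/G/T input; the helper name `reverse_complement` is hypothetical, and the sense-strand row used for comparison is one nucleotide row of SEQ ID NO: 5 as printed in this listing.

```python
# Complement mapping for unambiguous DNA bases (assumes only A, C, G, T occur).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string, ignoring whitespace."""
    seq = "".join(seq.split()).upper()
    return seq.translate(COMPLEMENT)[::-1]


# SEQ ID NO: 12 as listed (antisense primer).
primer = "TCTTTCAGTT GCACCCAAGC"

# One sense-strand row of SEQ ID NO: 5, copied from the listing.
sense_row = "CAG GAG GAG CAG GCT TGG GTG CAA CTG AAA GAG CAA GAA GTG CTC AAG"

target = reverse_complement(primer)
print(target, target in "".join(sense_row.split()))
```

The same check can be run against any other row of the listing by swapping in that row's text.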

(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "Synthetic Primer"

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:
GTGATAGGAA CTTCCCAGAG

(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 42 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "Synthetic Primer" 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: YES 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 
GCCTCGTGCA ATTTCCCTCG CACCTCCAAT GTGAACTCTC GC 42

(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 42 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "Synthetic Primer"

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: YES



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 
TCCTGCGAGG AAAAAGGAGC CGAGAATGAG CGGGGCCGTG GG 

(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 3264 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(vii) IMMEDIATE SOURCE:
(B) CLONE: Rat 5'OT-EST

(ix) FEATURE:
(A) NAME/KEY: exon w
(B) LOCATION: 1026..1241

(ix) FEATURE:
(A) NAME/KEY: exon x
(B) LOCATION: 1332..1478

(ix) FEATURE:
(A) NAME/KEY: exon y
(B) LOCATION: 1559..1682

(ix) FEATURE:
(A) NAME/KEY: exon z
(B) LOCATION: 2211..2647

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:
TGACCTCTGT GGATCTGATA TACATGTAAG TGACAGACCA TCCGAGCTAT ATAGTGAGAC 60
CTGTGCAAGG AAGGATGGAG TGCACGTTCC CTGATGTTCA GAGCAACCCT GTGTCACTCC 120
AGGTAGGTGA GATGAGAGGA AGAGGGTGGC CTTGGCCTGG GCCTCCTACG GGCCTGGAAG 180
TTGGGAGAAG GATGTAAGCA GACTCTGTTC TCTTCTGAGA AATATCAGGT ATTGCAGTCA 240
GCCCAGGCTC CTCAGACCCT CCTAAGTGCA GATTCTCTGC AGAATCTGGT GTTGACAACA 300
CTAATGAGTA GGATGAGACT TCAGTTCCCT AGCCCTCACC GTCAGCTTCT GATTACCAAC 360
AACTCTCCCA GAGGAGAGCC ATCTACCTTT GGGACAGATG CTCTCTGCCC TGCACTGCCT 420
CCTGTTTCTC TTCATTGTAG AGGAAGATAG TACTTTAAAA GCTTCATAAA TGGTCTCAAG 480
GTGGGAAGAC CCCGGCTCAG GTGAAAGAGG ACAAGCGTCA CCTCACACAG GCCACCCAGT 540
AGAAAACAAG TGATCACTGA TACTGAGAAC TCTGGCAATT GCAGAGCTGC CCAAGACCAC 600
AACAGGGCAG TGCAATGCAA GGAAAAGGTT TGTTGCTCGA TTGCAAACCT AAAGTTTAAA 660
GTGCATCAGG AGAACGCTTA CTCAAAGAGG AAGTGTAAGC CTAACTTAAG TAGCTAGAAG 720
CTCAGAATTT CTTGCATCAG CCCTGGAAGG GTACACAGGC CACCGGTGGG CCAGAGAACC 780
ACACGCTTTG GGGCGGTGTC CAAGCTTGTG AACAAGTAGG CAAGAGCGCC TGGTGTTGTA 840
GCTGTCATTG GCGGGCAATA CAGCCCAGCG AACTGTGGTC TCCAAGGTGC CCCTCGACCC 900
TCCCACTCTA CCCGAGACTC CAGGGACGCG ATGGGCCAGA CAGCAAGAGC TCCGCCTACG 960
GGGGCGGGGA CAGGAGATTC CCGTGATGCT CCTCGACCAC TTCCGGACAG GGCGCAGGCG 1020
CTAGCTGTCA TGTTGCGGGC TTTGAACCGC CTGGCCGCGC GGCCCGGGGG CCAGCCCCCA 1080
ACCCTGCTCC TTCTGCCCGT GCGCGGCCGC AAGACCCGCC ACGATCCGCC TGCCAAGTCC 1140
AAGGTCGGGC GCGTGAAAAT GCCTCCTGCA GTGGACCCTG CGGAATTGTT CGTGTTGACC 1200
GAGCGCTACC GACAGTACCG GGAGACGGTG CGCGCTCTCA GGTGTGTGTA AAGGGCAGGC 1260
GGCCTTCGGC GCCCCCTGGG AAGTGCTGGG GCTGGAGGAT GGGTGCTCAC TTGAAGCCCG 1320
TCCTCACCCA GGCGAGAGTT CACATTGGAG GTGCGAGGGA AATTGCACGA GGCCCGAGCC 1380
GGGGTTCTGG CTGAGCGCAA GOCGCl\AGAQ GCCATCAGAG AGCACCAGGA GCTGATGGCC 1440
TGGAACCGGG AGGAGAACCG GAGACTGCAG GAACTACGGT GCGAGAGGCG CGGGGCTGGG 1500
TGGGCTGGGC TAGGCTCACC CACGGCCCCG CTCATTCTCG GCTCCTTTTT CCTCGCAGGA 1560
TAGCTAGGTT GCAGCTCGAA GCACAGGCCC AGGAGCTGCG GCAGGCTGAG GTCCAGGCCC 1620
AGAGGGCCCA GGAGGAGCAG GCTTGGGTGC AACTGAAAGA ACAAGAAGTT CTCAAACTGC 1680
AGGTGGGCCG AGGTCGTGAG GAATGTGGGT ATTGGAGATT CCGGTGAGGG AGGCTCTGGG 1740
GAGAGCAGCA CAGGGTGTCA AGTGACCAGT CTTCAGGAGG CTTCTCTCTC TGCTCTGCAC 1800
ACACAGAGTG CCTCCCAGAC AATGGTCAAT GAAAGGTTAC AGGCTAGTAT TGCCGTGTGA 1860
AACTTGAAGG TCAGGGAAAC CATAAATGAG AATGGAGCTG TTTTTATTGT GTAAGGGAGA 1920
GTGACAAGGT TGAGAGAGTC CACCACCCCG CACCTCCCCC CGCCCCCAAT CAGGTTGTCA 1980
CGATTCGATT CGTTCTTGGG TTGTGGCTGA GAGATCTGAT GGGTAATTGT CCGAGGAAGA 2040
GGGATATAAT GGTTGAGGTC ACCTAGTACA GTTGTGCTGG CCTATTGGTG GGACACTCAA 2100
AGGGGCCCTG GGCTCTTTTG ACACCCTTCT TAAGGTGGGC TAGAGACAGT AAGTTATGCA 2160
GGCAGCCAGC TCTGAGAGAT CCCACGTAGC TAACCTTTCT CTTCCCGTAG GAGGAGGCCA 2220
AAAACTTCAT CACTCGGGAG AACCTGGAGG CACGGATAGA AGAGGCCTTG GACTCTCCGA 2280
AGAGTTATAA CTGGGCGGTC ACCAAAGAAG GGCAGGTGGT CAGGAACTGA GAACAGAGGC 2340
CTCTCAGGCC CAAATAAGGA CAGTGCTTGC CTAGGGACTG GATATTGGGG TAGAAATTGG 2400
TGCATCCCAG GAGGGTGGCA CAGCCTTGTC CAGAGCAGCC CCCATTCATT CTAGATTTGG 2460
CACCAGGTAT AGTACCTGTT CTGACACCAC ATACAAACTC CGGACAGCAT TAAACTCTGG 2520
GAAGTTCCTA TCACACAGAA GATCAGACTG GACTGTCCCC TCTAGAAGCC AAGAGCTGTC 2580
TCCTGAGTTT CTTGGAATAG TGTGAGCCCA ATGTTTCCTG CTTTTATAAA TAAACTATTG 2640
GAAAGCAAAG CCTTTTGTTA TGTGGCTTGC TTTTTCTTGT TGTAGAATAA GTTTATTTGT 2700
CCCAGTTATT TGGGTCTTAA GGTTATTAGC CAAAAGCCAG TTCACCTAAC TGAGCCAGGA 2760
GTTAGTTATC TGCTTTGCTC AATCCTGGGC TTTGCTGGGT AGGGTCAGGT GTGTCCAAGG 2820
TCCAGAAAGC AAAAAGGGTG CCCCGTTTCT CCTGGGAAGG CTTCCCCGTC AGTGATTTCT 2880
GTAACCGGAC CCTGCCCTGA CACAGCGTCA TTGGACTACC CAGCAGACAG TAGACTCCAC 2940
TCTAAACCCG CTTCTTGCGG TCAGTTGCTG TCCTTCAGTG TGTGTAAGCA GTGGCCAGAC 3000
AGCACCCTTG GGTGTCATTT CAAGACTCTC TCACCTTGGT CTGCTTTACG TTTGGTTTGA 3060
TTTGGTTTGT TCTGGTTTTT GAGACGAGGC CTTTCACTGG AACCTGGCAC TCAGTATTTA 3120
GACTGCCCAG CCAGCTAGCC TCAGAGAATG CATCTGCGTA TGCTTGCCTG GCGCTGGAAT 3180
TCGGTGCACA TGGCTTTGAT GTGTACCGGG GATCAGACAC AGATGTTTCA TGAGTGCAGT 3240
GCATGCCTGT TAGTGGTAGA GCTC 3264

(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 44576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: circular 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "Cosmid DNA"

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 
GCGGCCGCAT AATACGACTC ACTATAGGGA TCTGGTGGAG GACCTATGGC CCGCGAGCTA 60 



GAGAAGTGGT TCTCAACCTT CCTAGTGCTG AGACCCTTTA ACACAGTTCC TCGTGTTGTG 120
GGGAAACCCC CTCCTGCAAC CATAAAATAA TTTTTGTTAC TACTTCATAA CAAGTGTTGC 180
TACTCTATTG CTATGAATTG TAAAATAAAT GTGTCTTCCA ATGGTCTTAG ATGACTCCCG 240
TGAAAGGGTC ATTCTACCCC TAAGAGGTCA TGATCTACAG GTTGAGAACC ACTGATCTCC 300
AGTAACCTTC ACTTGAGTCC ATATCCTCCA TGAAGGTATG GAAGTCAATA AAACTGAGCT 360
TCAAGCCTCA TCAAAATGGG TCCATCCCCT GGTACAGTGT GAGTGGAAGA ATACCCACCA 420
TACGGTCACT GGAAGGAGGA TGTCTGAAGG GTCTTAGATT GTGTCAAGGG GTCCTGGGTG 480
TCAGGATCTG ACGAAGCAGG CTCGTCATGT TTCATGAAGA CTACAGGTAT GTGATTU^AAC 540
TGCAAGCTGG AAAAGTACCC ACTGAGCCCG TGTGGCTCTG CTGGGATTTG GAGGCATGAG 600
GAGCAGAGGG TCTGGAGGAC AGCAGTCCCA GAAATAATCT ATGACTAAGA AGGCTGAACT 660
GGGGTGACTC TCTGGTGGAA AGAGTTGCCT TTTAAGAAGG AAGACATACC AGGCATAGCA 720
ACAACTGCCT TTAGTACTAG CACTCTGAAG GCAGAGGAAG TCCGATTTCT CTGAGTTCCA 780
AGCCAGCTTG GTTTACACAG CAAGTTCTAG GCCAACTAGG GTTACATAGT GGACTCTCCT 840
CAAACGGGGT TGAGAAAGGA CTCAGCAGTT AGCTCAGTTA ACTCCAGTTC TAGGAAATAT 900
GATCCCTTAG TCTGACCTCT TGGCATGTAA GTGGTGCACA TACATATATG CACACAAAAT 960
ACATCAATCT GCAAAGGGGG AGGGAGGAAG GGCTGGAGTC TGAAGAAATA GTTCAGTGGT 1020
TAAGAGAATT CACTGCTCTT CCCAATAGCC AAATTCAGCT CCTAGCATCC ATGTCAGATG 1080
GCCCACGAAC ACCTGTAATT CTAGCCCCTA AACTCAGTGC CCCTTCACAA GACGGGGACA 1140
CACGTACACA TATACCTAAA AAATTAGGTG GTTTTTTTTT ATTTATAAGG TCAAATGCAG 1200
AATATCAAAT GGGTTAGACA GCAGCTCCAA GCTGGCCTCT TCCTCCCAGG GCTCTTCTTG 1260
ACTCTTGGCA CCCTCTTTGG GTCCAGAACC CAGACATTAG CCATGACTCA GCTGATAAAA 1320
TGCAACCCAT GGCTCATTAA TTAGGAAGTC TGTAATTAGC CTGTCTGGTA GCCTCCAGAG 1380
AGAACCCCTT TCACCTGTCT TCCTCCTCTC ACCCAGGGGA AGAGCTCAGT TTTGCCCCTG 1440
AGACAGAAGA AGGGAACGAG ACCATGAGCA ACGGGAAATG AGATGCTGGC GCACACACAC 1500
TTTATGTGTG TGAAGTCTCA GAGAGGTCAC CAATAATGAG GCAATGGAAA TGAGCTGAGC 1560
TGCCTGAACC TCCAAGTTTC CTCCAAGAAA ACCCCACAGG GGAGATGGGG CATGGCCCAG 1620
GCCAGCTGCC CCAGCCTCTG CTGGCAGAAJV GTGAGCCCGC TGCCATTTTA ATTTTTGATA 1680
CAGGGTCTCA CTCTACAGCT CTGGGGGCCT AAAACTCACT ATGTAGACTT CAAACTCAAC 1740
CAAACCAACA ACAAAAACAA ACAAAACCCC TGCACTGACT GGAGAGATGG CTCGGTTGAG 1800
AACAATGGCT GCTAGGAGTC AAACCCAGGT CCTGTGGAAG AGCATGCTGG TAACTGCTGG 1860


GTCATCGCTG GGTCACTCTC TTCACACACA CACACACACA CACACACACA CACGGCAATG 1920
AACTCTTCAG TGTCTTGATT TACGGTTTCT TCCGATAAAT CCTCAGGAGG GCAGTCAAGT 1980
GGCTCATTTG GCAAATGCTT GCCTGAGACC TGAGTTTGGT TCCCAGAACC CATGGAGGCA 2040
GAAGGAAAGG GCTCCACAAA GCTCTCTTCT GAACTCCATA TGTGCACACA CACCCACTTC 2100
GCACACATTC ATAATAGTGA TGAATGAAAA TGAAGACAGA TTVAAAAAAAC CAATTTCGTG 2160
AAACTGTTAG CACGTTCAGT CAATGGCTTT GGGGGTAACC TGTTTCAGAG CCATGGTACT 2220
CAGTCACTAG GCTCATACTG GTCAGACGCT GAGGTCAGCA ATGGAGAGCT GCTACACCTA 2280
AAGGTAGCAG AGGTCATTTG GCTCTGACTC AGAATATTCC AGCTCTCCAC ATTCACAGAA 2340
GTTCTACTTG GTCGTAGAAA AAAGCTGAGC ^rprprprprprprprprp XTTTGGAACT TTATTTTTTT 2400
AAAGATATAT TTATTTTATG TATATGAGTG CACTGTAGCT GTCTTCAGAC ACACCAGAAG 2460
GGGGCATCGG ATCCCATTAC AGATGGCTGT GAGCCAACAT GTGGTCGCTG GGGATTGAAC 2520
TTAGGACCTC TGGAAGAGCA GTCAGTGCTC TTAACCGCTG AGCCATCTCT CCAGCCCTGG 2580
AACTTTATTT TGAACATGCA ACCCCACCTA CCACTATGGG TTCAGTCACC AGCGCCTTAG 2640
GAATAAAATT GGAGAAAATA AGCTTTATGG TTAGTCAGCT GTCAGCTGTG GGGTTGGGGA 2700
CAGAAGAATG GTTATGTTTT GTTTTCCCAT CT^GGCCTCA CTCTGTGACC TGGTTGGTCT 2760
GCCACTTGCT CTGTAGATCA GGTTTCAATT ACAGAGATCC ACCTGCTCCG TGTCGCTATG 2820
CTGGGATAAG AACTAAGTCA CCCTGCCTAC CTTATTACTT TGTATTCTTG GGCATGGAAC 2880
TCAATTCCTT GTCAGCGAGA GAATAACTTC CTCGATCGGA GTGTTTTTAT GTGAATTGGG 2940
CCAAAAAGAC TGCGATGCTC TGAGACCTAT TTGTGAAGCC AAGAGTAGTG GGTAGCACAA 3000
GTCAGAAATC CAAGGACTTG GTAAGCTGAG ACAGTAGGAT GTGTGCGCTC ATGCACACAC 3060
ACACACACAG ACACACACAC AGACACATAC ATGCATGGAC GCACAGAGGC ACCCACGCAC 3120
ATGTGCCTGG ATGAGGCTTC AGTTCTTCAT AAAGCTGCCT TTGAGTTTGT GCCCTCCCAC 3180
TCTTCCTGAG GACTGGAGTC CTCACACCTT GGGCTGATAG TGCACCACTA CCTTTTTTAG 3240
TGACCTCCTC TTTGCAGTCA CAGGCTGAAG GTACAGGGAG GACTCTAGCG GCCGTCTGCC 3300
TCTGTTTAAC ATGAACCTGC AAGGCAGTGG GCAGCCTCAC CCCTAGCGAT GGCACTGAGT 3360
GATGCCAGGA ACGCTGTCCT CATGTGCCCT TGGCTGTTGG GGCACAGTGT GCCTCTGCAG 3420
GGCCAGCCTG ACCGTGTGTG CCAGCCAGAA TGCACAATTT CTGCCCGACC TTGGAAGCTT 3480
TTTGTCTTTC CTTGTGAGTT TCTTGTCACC CAGCAGTGTT TCTTGCCTCT TTGCTTGACG 3540
CCTCTATGGG AAGATGGACA AGACTTTTTT TTTTCTACAT CCCCTGCAAA CAGGTTTGTC 3600
ATACCTCTCA GGGGCAGGGG TCTTGTCCCT GTCAAGCGCA GCAGGCCACC AGACCCAGAA 3660
CTATGAAATC TACCCAACTT GTCTCTGTAC AAAGTTAAAC AACAAAAAGA AACTTGGTTT 3720



TGTTTTTGTT TTTTTTTTTG TTTTGTTTTG TTTTGTTTTT TGAGACAGGG TTTCTCTATG 3780
TAGTCCTGGC TATTCTGGAA CTTGTTCTAT AGACCAGGCT ATCCTGGAAC TCAAAGAACG 3840
GCCTGACTCT TGCTGGTCAC TCTGAAGATC TGTGCCACCA TCATCAGGCT 3900
GGGTTTTAAA AGATTATGGT TTATATTTAA TGTGTATGAG TGTTTTGTTT GCATGTATAT 3960
CTGTACATGA CAGGTGTGCC TGGTGTTTAC AGAGGCCAGA AGAACATACC AGATCCCCCT 4020
GGAACTGAAG TTAGAGACAG TCGTGAGCCA TCTCGGGGTT GCTGGGGACA GAATCCGAGG 4080
GCTCTTCTTG AGTAGCAAGT GCTTCTAACC GCTTAGGCCT CTCTGCAGCC CCCACTTACA 4140
GGATTTAAAG GTAGAACTAG GTTTGTCACC TGTCCTGGAG ACCCTGGCCT TTAATTCCAG 4200
AACTCTGGAG GTAGAGACAG ATGATTCTCT ATGAAGTTCA GGCGAGCCTG GTCTACACAG 4260
AGTGCCGCAT GATAGCAAGA AGAAGATCCT GTCTTTAAAA GAGACGAGAG GGGTTGGGGA 4320
TTTAGCTCAG TGGTAGAGCG CTTGCCTAGC AAGCACAAGG CCCTGGGTTC GGTCCCCAGC 4380
TCCGAAG/iAA AAAAAGAGAC GAGAGCCAGT GGTTGGTGCA CGTCTTTGAT CCCAGTACTC 4440
TGGAGGCAGA GGTAGTGGAT CTCTCTTGAG TTCAAGGACA GCGTGGTCTA CAAAGTGAGT 4500
TTTAGGACAT CCAGGATTAC ACGCACAGAA ACCTTGTCTC ATAAAACAAC AAACAAGACA 4560
AGACAGAAAC TCTCCTAACG TAGACCGCCA CACCTGATTT TTAAAAGCTC TCAGTGAAAC 4620
TGAGCATGGT AGCACATGTT TGTAATCCCA GCAGACATGT GGGGAGACAA AGGAATGGAC 4680
TCAGACTCAG CCGGAGAGCA AGTTCACGGC TAGACTGGAC CATTCCTACA ATGAGGTAGG 4740
AATTGGGGTT AGCACATCAA GTAAGTT^CC CTGGTAACAA GTTTGACTTG TCCAAGGTCA 4800
CACAGCAATG TCTGGAAAGC TAAGTCTGGT TCCAAGGCCC CCCCTTCCTC CCTCTCTCCC 4860
TCTCTATAAT TGAAAAGTCC ACTGCTTGGC AAAAACTCCC AGGACTATAT TAAACACAAA 4920
TGCTGGTGTT CTCCATGTCT TAGGGCTTTT ATCCTAGAAG GAATTCAAAC ACACAACACG 4980
AATACCCCAC AGAAAGGAGG GCAGGGTGGA GGGGTAAGGG AGAGAGGAGG AACTTCAGGC 5040
TACTGGGGGT ATTAACCAGC TCTGTACCCC ATCCACACAG ACCCAAGTTA GAAAAGAGCA 5100
GGAGAGGGGG TCTGGAGAGG TTGTTAACTG GCCCAGCAGT TTGGCCTGCT CTTGCAGGGG 5160
CCCAGCTCTG TTCCCAGCAC CCATTTCAGT GGCTCACAAC TTTTAACTCC AGCCCCAAGG 5220
ACTCTGCTTC CCTCTGAGAG CTCTGTACTT AACAGGGACA CACAGACACA TACAATTAAA 5280
AAAATGTTTT AAAGTGAGAG ACGCTCTAGA CAGGCTAGCA AGTATTGAGT TGTGGCAGGT 5340
ACAGCTATTT TAATAGTGAT TTCAGGTTAG AACCCTGGGG AGGGGGAACC AGGAGTTAAA 5400
CTATGTTAAA TCAGAAAGAC CCAAAGCCAA TCTGGTGGAA GCTGCCATTG GAGGTTCTAA 5460
CAAGTCTGGC TTGTCAGGGA AAGGCTCAGA ATGAAGGTTT GAGCTGGGGC ATCATTAGTG 5520
TATAAAAAGT ATGAAAACAC TCTAGGAAUA rt.tjAOM^O-rt.00 AGGAACACCA CGGAGAGCGA 5580
GCCTTACGAT GilGCAtjCAL- /^'Ti7\/~'7\^r^OP'Zi AGGAAACCAA GCACAGGGAC 5640
CAGGAAAGCC CATCCTGGGA AACTAGGCTG 5700
GGAGAGGCCG TG11AAAXPiJ\ AL:;ACHoHO aPPPATPAAA ATGACCCACA GAGGGCTTCA 5760
TGATTACAAT AGTAiiIGAl CTGAATGCAG GGGATTACAG 5820
AGTA^lATGCT GACTTTTGGA 1AAGAAlObC ACAGGTGTGT GACTCACATC 5880
TTTAAAGCAC ACTCCCAGGG CAGAGG1Ao1 kjAo111w/^O1 TTAGGGGCTA GTCTGGTCTG 5940
GACTGGAAAT TCTATGAGAC CCTGiiCiGA HAAA^^irt-rt-riO TATTTGGGAA AAAAGAACTT 6000
CTGAGGGAAA TGGAGGCCGT G1AGGiGiCi PTGCGGCAGG TGGCGAGGGA 6060
GGATCTGAAA TGGGGAGAGi GAGGAGAC1Cjr r'TPPAPPTTT PPTAGCCAGC AGAGATGCTA 6120
AGGCAGGTGA AGATTAGGTC 1GAlGGACC1 oA0A0v--'^O-LO PACACAGGCA GCATGGCGCC 6180
TTCAAAGCTC TAGTGGATGT GAiiGCCCCA r'APAAPTPTP CCCCAAAGCT CATCTTCGTC 6240
CATTAATAGA AAAAAGGTTT GiiGi(jACCA TCTCTCTGGA AAACAATCAC 6300
TTAACAAGGA AGGAALjCi(jC TPTPPPATPA PATPACCATG ACGCAAGCAC 6360
TTCCCTTGGG GTTCATACGC AGiLlACiCAtj TPPTPAPPAP CCTGTGCTAG GCTTGGCCCT 6420
CACTCCTTTT CCGCTGGAAT iAAGiGGGvjA PTPAPAPAPP PPAGAGGACC TGCCCAAGCC 6480
AGAAAGCTTC AAGCCACAGG AGGGAGiGiG PPPTACACAT GAGCTGTCTC 6540
TTATCCTCGA TCGAGGGCCT GAGAGiCAii PPTPAAAAPA TPTGGCCCCC AGCCCTGAGT 6600
ATGGAAGGCT AACTTGGCTA CGAGiGGGCA TAPPAAGAGG CAAAACCGTC 6660
CTCTGGCACT CTCTTGAAGC AlAG1tjCjiAi ZITPPPAPAPA PPTAACAGGA GCCGATGGGA 6720
GCTGGGAGGG TCCTGGCCTA GGCATAGiGi ALjAA^jAI^^11 PPPTAAPTAG TCTGGGTCCC 6780
CAAACCATAA CATTTTTCTG GTGAGiAAALf AAAALjoAoi TpTAAPPPTA AAGCAGAATG 6840
TGGTGATACA CGCCTACAGT A'-JJA'-JwiooA PATAPAAAPA TCAAGAGTTC 6900
AATGCCAGCT TTCTGCTATG TAGTAAGGTC AAUoiCAtjL^U TPPAPTAAAP GACTGCCTTA 6960
GAAACAACCA AATGACTTAC CGTCTAAAGT CAooAAO1AO APTTPPTTTP TCAGACTGTG 7020
TCTGTCTGTC TGGGGCTCCT CCCATTTGGi C1COiAAOAA PATPPAPTTP CACTCCTGCC 7080
TTAGATCTGA GATAGTACCA GCCTCAGGGG AiOOlOlOiOiO PPPATAPPTT TTCCTCTGCA 7140
GTACTGTGGG CTCACCTAGG ACTGTTTCTG AACTATATCC TACCCTAGCT CTCTACCCTA 7200
GAAGGCCTGA AACTCACAGA AATTCTCCTG CCTCTGCTTT CCAATGGCTG GGGTTAAAAG 7260
CATGTGTCAC AACTGTCCTT TTTATTCTTT TAATATCGAG ACAGGGTCTC ACCAAGTTGC 7320
CCCAAGACGC CAGCCACACC TGGGACAGGG CAGGCCTTTG GCTCTATGTT CAGTCTTGAC 7380
TCCATGACTG TGGCCGCTAG CCCATGAGGC TGCGCGTGGG AATTTCCTTC TGAAAGCTCA 7440
CCTGGTATCG ATGCTTCCTC TTATCCTACA CCACAACTAA CAAACCTGCC CCACCTCCTG 7500
GTCCTGACCC TGCTGCAGAC CTGCTAGTCC TTGGTGAATG AGACCTGGGG ACCCCTCTAG 7560
TCTGTTGAGA GCTGCTGAAA TGCTCAACTA TGATTTCCAG GTGACCCTCA AGTCGGCTCA 7620
CCTCCCTGAT TGCACAGCAC CAATCACTGT GGCGGTGGCT CCCGTCACAC GGTGGCCAGT 7680
GACAGCCTGA TGGCTGGCTC CCCTCCTCCA CCACCCTCTG CATTGACAGG CCCACGTGTG 7740
TCCCCAGATG CCTGAATCAC TGCTGACAGC TTGGGACCTG TCAGCTGTGG GCTCCTGGGG 7800
AGCCACTGGG GAGGGGGTTA GCAGCCACGC TGTCGCCTCC TAGCCAACAC CTGCAGACAT 7860
AAATAGACAG CCCAGCCCGC TCAGGCAGCA GAGCAGAGCT GCACGACGCG TCGATCCCAA 7920
GGCCCAACTC CCCGAACCAC TCAGGGTCCT GTGGACAGCT CACCTAGCTG CAATGGCTAC 7980
AGGTAAGCGC CCCTAAAATC CCTTTGGCAC AATGTGTCCT GAGGGGAGAG GCAGCGACCT 8040
GTAGATGGGA CGGGGGCACT T^CCCTCAGG GTTTGGGGTT CTGAATGTGA GTATCGCCAT 8100
CTAAGCCCAG TATTTGGCCA ATCTCAGAAA GCTCCTGGCT CCCTGGAGGA TGGAGAGAGA 8160
GCTCCTGGAG CAGGGAGAGT GTTGGCCTCT TGCTCTCCGG CTCCCTCTGT 8220
TGCCCTCTGG TTTCTCCCCA GGCTCCCGGA CGTCCCTGCT CCTGGCTTTT GGCCTGCTCT 8280
GCCTGCCCTG GCTTCAAGAG GGCAGTGCCT TCCCAACCAT TCCCTTATCC AGGCTTTTTG 8340
ACAACGCTAT GCTCCGCGCC CATCGTCTGC ACCAGCTGGC CTTTGACACC TACCAGGAGT 8400
TTGTAAGCTC TTGGGGAATG GGTGCGCATC AGGGGTGGCA GGAAGGGGTG ACTTTCCCCC 8460
GCTGGAAATA AGAGGAGGAG ACTAAGGAGC TCAGGGTTTT TCCCGACCGC GAAAATGCAG 8520
GCAGATGAGC ACACGCTGAG CTAGGTTCCC AGAAAAGTAA AATGGGAGCA GGTCTCAGCT 8580
CAGACCTTGG TGGGCGGTCC TTCTCCTAGG AAGAAGCCTA TATCCCAAAG GAACAGAAGT 8640
ATTCATTCCT GCAGAACCCC CAGACCTCCC TCTGTTTCTC AGAGTCTATT CCGACACCCT 8700
CCAACAGGGA GGAAACACAA CAGAAATCCG TGAGTGGATG CCTTCTCCCC AGGCGGGGAT 8760
GGGGGAGACC TGTAGTCAGA GCCCCCGGGC AGCACAGCCA ATGCCCGTCC TTGCCCCTGC 8820
AGAACCTAGA GCTGCTCCGC ATCTCCCTGC TGCTCATCCA GTCGTGGCTG GAGCCCGTGC 8880
AGTTCCTCAG GAGTGTCTTC GCCAACAGCC TGGTGTACGG CGCCTCTGAC AGCAACGTCT 8940
ATGACCTCCT AAAGGACCTA GAGGAAGGCA TCCAAACGCT GATGGGGGTG AGGGTGGCGC 9000
CAGGGGTCCC CAATCCTGGA GCCCCACTGA CTTTGAGAGA CTGTGTTAGA GAAACACTGG 9060
CTGCCCTCTT TTTAGCAGTC AGGCCCTGAC CCAAGAGAAC TCACCTTATT CTTCATTTCC 9120
CCTCGTGAAT CCTCCAGGCC TTTCTCTACA CTGAAGGGGA GGGAGGAAAA TGAATGAATG 9180
AGAAAGGGAG GGAACAGTAC CCAAGCGCTT GGCCTCTCCT TCTCTTCCTT CACTTTGCAG 9240

AGGCTGGAAG ATGGCAGCCC CCGGACTGGG CAGATCTTCA AGCAGACCTA CAGCAAGTTC 9300

GACACAAACT CACACAACGA TGACGCACTA CTCAAGAACT ACGGGCTGCT CTACTGCTTC 9360

AGGAAGGACA TGGACAAGGT CGAGACATTC CTGCGCATCG TGCAGTGCCG CTCTGTGGAG 9420

GGCAGCTGTG GCTTCTAGCT GCCCGGGTGG CATCCCTGTG ACCCCTCCCC AGTGCCTCTC 9480

CTGGCCCTGG AAGTTGCCAC TCCAGTGCCC ACCAGCCTTG TCCTAATAAA ATTAAGTTGC 9540

ATCATTTTGT CTGACTAGGT GTCCTTCTAT AATGACGCGT CGTGCCCACC TATGCTCGCC 9600

ATGATGCTCA ACACTACGCT CTCTGCTTGC TTCCTGAGCC TGCTGGCCCT CACCTCTGCC 9660

TGCTACTTCC AGAACTGCCC AAGAGGAGGC AAGAGGGCCA CATCCGACAT GGAGCTGAGA 9720

CAGGTACCAC TGTGGTCCGT TCAGGGCTGC TGACAGTGCC GTAGGAAGGG TCATGGGCTA 9780

GGAGAGAGGG AAACCTTGTC TGAGCAGTCA GACTTTAGGG GAGGTTCCTG GAAGGAAGCA 9840

GTTATCTTAT ATGGAGTAGA TGGGTTTCCC AGT^^^CGGTAA GAGGGGACCA GGTGCCAGAG 9900

AAGCCACATA AAGGACAGTG TCCCCAGGCA GGGGATATGC CAGAAAATGA GAGATACTTA 9960

TCACTGGGCT TGGGATGAGA ACGGGTTAAA CTGGGTACCC TGGCCTCCTC TGCACAGCTG 10020

GAGGTGGCCG GTGGTATGTT GGCTCACCAG GACTGGGTAG ATGGTACGAA ACTGTTCTCG 10080

CCTGAGTACA AAGCCTTTCC CACCCAGCTC AAACTCTCTT AGCTCCTTTT TTAGCCAGCT 10140

GCACCGGTTT CTTCCTGTCC ACGGAAGACG GCCATTGCCC TGTGTCTGAG CGGAGTATGT 10200

CCCACATCTA GCCTCAGCCT CGTGCCCAGA TCTGCTGTAC TGTATGTTCA GCTCTGAGTC 10260

TGCCCTTCCG GCAGGGCTGA AGGGAATCCA GTCACTAGGC TCAAATCTGG TCAGGTCACA 10320

GGTGGCTCAG TTTTGAACAA GCTCGATGGG CAGTAGGCAG TTCACCGAGT CTGCCTTCCG 10380

TTTGCTGAGT TCCTTTGGAG ACTTCCGAGG CACTAGGTGT GTCTTGCACC CATCAGCCTA 10440

ATTCGGTCCT TGCCACCTTC CTACTAGGGC ATAATAGGTT GGCGGGAGGT AAAAGCCCAC 10500

CAGCGTGGGG CAGGGGTAAG AGTGAGCGAG CCGTAGGTAC AGGA7\AGAGG ATCTTGGAAT 10560

GTGTAGGGCC ATCTGAATGT CGGAGAGGTA AGTCTCTGAG AGACTGCTGC ACACCGGTGA 10620

CACATCAGAG CTGAGGAGGT CCCCCAAGTG TTGTCTCCCC CGCCCCCCGC CCCATACGAC 10680

TCTGTCAAAG CAGGAGAGGG TTTTGAGACC TCATGAGAAC TGATCCTCCT GATAACCTAG 10740

CCGGTTAGAT TTCCACTCTC GCCCTTTACG GCTGCTTCGT CCTAGATAGA GCCAGAGCAT 10800

CTGGCCGGTG AAGCTGGGAT AGCAGCAGGG TGACCTTAGG TTCCCAACGC CCCTCTTGGC 10860

CTGGCTCCAG CTGACCCGCG TCCTTCCCCG CAGTGTCTCC CCTGCGGCCC TGGCGGCAAA 10920

GGGCGCTGCT TCGGGCCGAG CATCTGCTGC GCGGACGAGC TGGGCTGCTT CCTGGGCACC 10980

GCCGAGGCGC TGCGCTGCCA GGAGGAGAAC TACCTGCCCT CGCCCTGCCA GTCTGGCCAG 11040

AAGCCTTGCG GAAGCGGAGG CCGCTGCGCT GCCGCGGGCA TCTGCTGCAG CGATGGTGCG 11100

CACAAAGCCA GGCGGGCTGA GCATGGGGAA TGGATGGGGT GGGTGGGAGG TA^^GGGGGG 11160

CTAAGTGGGG GACTGAGGAA TCAGGACCGG AGATGGAGGG TGAGTAGTAT GAAGGGGGTC 11220

GAGAGTTGGA ACGTAGCAGG GTAGGATAAA GGGGATTGTG GGGATGGCGC CCCTATAGGT 11280

GCGCCCACCC CAGGACGCCT GACCTCACAC AGCCCTTCCT TCAGAGAGCT GCGTGGCCGA 11340

GCCCGAGTGT CGAGAGGGTT TTTTCCGCCT CACCCGCGCT CGGGAGCAGA GCAACGCCAC 11400

GCAGCTGGAC GGGCCAGCCC GGGAGCTGCT GCTTAGGCTG GTACAGCTGG CTGGGACACA 11460

AGAGTCCGTG GATTCTGCCA AGCCCCGGGT CTACTGAGCC ATCGCCCCCC ACGCCTCCCC 11520

CCTACAGCAT GGAAAAT7\AA CTTTTAAAAA ATGCACCCTG GTGTCTGTCT CTCTTTCTGG 11580

GGTGGGGAGA AAAGGGGGGA GAGGAATTGG AGTGGGAACT TTCTACTCTG CTCTGACTGA 11640

TCCCCACATC CAAAGTCGTG CATAAGATAC GCCCCCACCG CCAGAAGGGG CAGAACCTAT 11700

AAGTCTTAGA GTATAAAGGA AGCTTCTGCT GCTCCTGGAT ACCCACATAA TACTCAGAAA 11760

TVAAAGGCAAG TCAGAAGAAG GGAAAGATCT GAGATCCAGA GGAGCCTGAA GGGTCAGGGT 11820

GACTTAGCAA GTTTCTATCT GAGACCGATy^ TAAAAGGACA TTGTGGACAA GAGAAACAGA 11880

GCAGGACATG AGGAGAGACA GGATCAGCAA GAGTGACAGA GAAAGAGGGG ACAGGCCAGG 11940

GGTGGCCATC TCAGCCCTGA TTTCACCCAG ACTAAGGCAA AAACAACGTG T^^GGACTCTT 12000

AACCAAGGCT GTGCTTGGAT GGGAGGAGAA GGTACAGAGA CATTACCCCA GACCTAf^GA 12060

AGACAATGCC ACCCGCCTTC TCTCCAGGTG CTCCACCATC AAGACCCAGC CACTGAGAGG 12120

CAGACTCCAG T/VAGAGTCCA GCTACAAGTC CTCTACAGGC ACATGTTCAA ACCGTCACAC 12180

CCACACTCAG GCAGGGAAAT AGACAAGATA GGCTGGAGTT GTGGCTCAGG AGCAGAAGTC 12240

TTCCCTAGCT ATGGTTCCAG CACCATGGAG ACGGAAGGGG GACTGAGCGG GGGGGGGGGC 12300

GGGGGGGAGG AAAAAGTAGC AGCTACTAGG GGCATTTCTA TGACCCTTGT CCTCAACCAT 12360

AGCTAGAGAC CCAGAGGAAC ACAGAAGTCC AGCAGCAAGG CGCACATGCT TGCAATAGCT 12420

CCCAGACGTA AATACTTCAT TCCGTTCGGC ACATCCGGGT CATCAGCACT TGACTCCCCC 12480

CCCCACACTT CTTATTACCT CCTCTTTTTT CTT^U^ATTTT AGATTTATTC ACTTATGTAG 12540

ATGGGTGTTT TTGGTTTTGT ATGAATGTCT GAGCACCATG TGTGTGCCTG GCGCCTCAGA 12600

GGTCAGGAGA GGGCATCGGG TCCCCTGGAA CTAGTTACAG GTGGTTACAG CCTACCGTGT 12660

GGGCACTAGG AACTGAATCC CAGTCCTCTT AACTGACCCG CACATCCAAA CCCAGGCTTC 12720

AGCCCCTCAT CAGCCTGTCC CTCCTCCAGG CCCTCAGGTG TCTCCCGTCT CCGGCTGCTC 12780

TCCCAGACAT CCTTCCATCC TCTGGTCTCC CTGCTCCTCG CCCTCCTGTT AACATCCTTT 12840



CTCTCTGCCC CATCTGTCCT GGGCATCCTC TCCTGCGAGC TGCAGCAAGG TCAGGATGGT 12900

TTACCTCATT TGGGATGGCC TGCAGGTTCT GAGGTCAGGG GCAACTACAG KQKN:hKQyKC^K 12960

GATTAGTCTG ATTGACTTAA GGTGGTTCAG CAAGGTCAGC TCTGCCCAGA CTCACGGTCT 13020

TTTACCCAGA TGCCAGCTCT CTTCCCATCT CCTCGGTGCC TATACACCTC TCTGCATGCC 13080

CCGGTTTAGA CAGGTAGCAC AGGGGCCAGG CAGACTCCTA TCCCAGCCTC CTCCTTCTGT 13140

GGCCCTCTTA GGGTCTGACC TCCAATAGGG CAGGGCCAGG QhKQQQZZKQ ACCAAAAAGG 13200

GACAGAAAGA AGCGTGGCAG GCGGCATGGG CACACTTGAT TCAACCCCTA CGGCTGGTGT 13260

ATGGGCAGCT TTAG^lATGAA GGTCAGATTC TCACTTCGAG CCTCTGCGCA GGTGGAGTGT 13320

TGTAAGCGTC TCGCTTTCCT CCACCTGTTT CTGGAJ\GAJ\T CAGGCTCCTC TTCCTCGAGG 13380

AGAGAATTAT ACCTGCTCAC CCTACTTCTG CCTACTGGAC ATAATATATA TTTTTTTCCT 13440

TTCAGGGAGT CCTTTCCTCA GCTACAGAGC CATTTAAGGG CACTCCCAGA GTTCACAGCA 13500

GATGCTTGCC TCCTCTCTTC AGCCTCCAGA AGCAGAGAGG CTTGTGAGCA AATGCCAGGA 13560

CCTCTGACCT CCACACAGAC GCTGTGCTGT GTGCACAGCC CTCAAGCACA CAGCGAAGCA 13620

ATAGTGAAAA GTTIACTTAGA CCATTTTCAG GCTGGGGAGA TGGCTCAGGA GATAAGAGAT 13680

CACTGCTCAJ\ CTTGAGCCTC GGGACCACAG GTAAGACCAA CTTGTCTGCT GCT^GAGAGC 13740

TGCCTGGTGA GATTGGGACA CACAGAGGCA GAGTTCATCT AGGACCGGGC ACGTCCTGTG 13800

TTTGCCGAGG TCCCACACCC GCGGATCCCG GCCCGCAGCA GCTCTCTGCT CCCAGAACCC 13860

GTGAGAAAGA GACCTCACCG CCTGGTCAGG TGGGCACTCC TGAGGCTGCA GAGCGGAAGA 13920

GACCACCAAC ACTGCCCACC CCTGCCCACA TCCCTGGCCC AAGAGGAAAC TGTATAAGGC 13980

CTCTGGGTTC CGTGGGGGAG GGCCCAGGAG CGTCAGGACC CCTGCCTGAG ACACCGCCGG 14040

AACCTGAGGG AAACAGACCG GATAAACAGT TCTCTGCACC CAAATCCCAT GGGAGGGAGA 14100

GCTGAACCTT CAGAGAGGCA CACAAGCCTT GGAAACCAGA AGAGACTGCT CTCTGTACAT 14160

ACATCTCGGA CGCCAGAGGA AAACACCAAA GGCCATCTGG AACCCTGGTG CACTGAAGCT 14220

CCTGGAAGGG GCGGCACAGG TCTTCCTGGT TGCTGCCGCC ACAGAGAGCC CTTGGGCAGC 14280

ACCCCGCCTG GTGAACTCAA GACACAGGCC CACAGGAACA GCTGAAGACC TGCAGAGAGG 14340

AAAAACTACA CGCCCGAAAG CAGAACACTC TGTCCCCATA ACGGACTGAA AGAGAGGAAA 14400

ACAGGTCTAC AGCACTCCTG ACACACAGGC TTATAGGACA GTCTAACCAC TGTCAGAAAT 14460

AGCAGAACAA AGTAACACTA GAGATAATCT GATGGTGAGA GGCAAGCGCA GGAACCCAAG 14520

CAACAGAAAC CAAGACTACA TGGCATCATC GGAGCCCAAT TCTCCCACCA 7U=lACAAACAT 14580

GGAATATCCA AACACACCAG AAAAGCAAGA TCTAGTTTCA AAATCATATT TGATCATGAT 14640

GCTGCAGGAC TTCAAGAAAG ACGTGAAGAA CTCCCTTAGA GAACAAGTAG AAGCCTACAG 14700

AGAGGAATCG CAAAJ\ATCCC TGAAAGAATT CCAGGAAAAC ACAATCAAAC AGTTGAAGGA 14760

ATTAAAAATG GAAATAGAAG CAATCAAGAA AGAACACATG GAAACAACCC TGGATATAGA 14820

AAACCA7VAGG AAGAGACAAG GAGCCGTAGA TACAAGCATC ACCAACAGAA TACAAGAGAT 14880

GGAAGAGAGA ATCTCAAGAG CAGAJVGATTC CATAGAAATC ATTGACTCAA CTGTCAAAGA 14940

TAATGTAAAG CGGAAAAAGC TACTGGTCCA AAACATACAG G7U\ATCCAGG ACTCAJVTGAG 15000

AAGATCAAAC CTAAGGAThA TAGGTATAGA AGAGAGTGAA GACTCCCAGC TCAAAGGACC 15060

AGTAAATGTC TTCAACAAAA TCATAGAAGA AAACTTCCCT AACCTAAAAA AAGAGATACC 15120

CATAGGCATA CAAGAAGCCT ACAGAACTCC AAATAGATTG GACCAGAAAA GAAACACCTC 15180

CCGTCACATA ATAGTCAAAA CACCAAACGC ACAAAATAAA GAAAGAATAT TAAAAGCAGT 15240

AAGGGAAAAA GGTCAAGTAA CATATAAAGG CAGACCTATC AGAATCACAO CAGACTTCTC 15300

GCCAGAAACT ATGAAGGCCA GAAGATCCTG GACAGATGTC ATACAGACCC TAAGAGAACA 15360

CAAATGCCAG CCCAGGTTAC TGTATCCTGC AAAACTCTCA ATTAACATAG ATGGAGAAAC 15420

CAAGATATTC CATGACAAAA CCAAATTTAC ACAATATCTT TCTACAAATC CAGCACTACA 15480

AAGGATAATA AATGGTAAAG CCC7\ACATAA GGAGGCAAGC TATACCCTAG AAGAAGCAAG 15540

AAACTAATCG TCTTGGCAAC AAAACAAAGC GAATGAAAGC ACACAAACAT AACCTCACAT 15600

CCAAATATGA ATATAACGGG AAGCAATAAT CACTATTCCT TAATATCTCT CAACATAAAT 15660

GGCCTTAACT CCCCAATAAA AAGACATAGA TTAACAAACT GGATACGCAA CGAGGACCCT 15720

GCATTCTGCT GCCTACAGGA AACACACCTC AGAGACAAAG ACAGACATTA CCTCAGAGTG 15780

AAAGGCTGGA AAACAATTTT CCAAGCAAAT GGTCAGAAGA AGCAAGCTGG AGTAGCCATT 15840

CTAATATCAA ATAAAATCAA TTTTTAACTA AAAGTCATCA AAAAAGATAA GGAAGGACAC 15900

TTCATATTCA TCAAAGGAAA AATCCACCAA GATGAACTCT CAATCCTAAA TATCTATGCC 15960

CCAAATACAA GGGCACCTAC ATATGTAAAA GAAACCTTAC TAAAGCTCAA AACACACATT 16020

GCACCTCACA CAATAATAGT GGGAGATTTC AACACCCCAC TCTCATCAAT GGACAGATCA 16080

TGGAAACAGA AATTAAACAG AGATGTAGAC AGACTAAGAG AAGTCATGAG CCAAAGGGAC 16140

TTAACGGATA TTTATAGAAC ATTCTATCCT AAAGCAAAAG GATATACCTT CTTCTCAGCT 16200

CCTCATGGTA CTTTCTCCAA AATTGACCAT ATAATTGGTC AAAAAACGGG CCTCAACAGG 16260

TACAGAAAGA TAGAAATAAT CCCATGCATG CTATCGGACC ACCACGGCCT AAAACTGGTC 16320

TTCAATAACA ATCAAGGAAG AATGCCCATA TATACTTGGA AACTGAACAA TGCTCTACTC 16380

AATGATAACC TGGTCAAGGA AGAAATAAAG AAAGAAATTA AAAACTTTTT AGAATTTAAT 16440

GAAAATGAAG GTACAACATA CCCAAACTTA TGGGACACAA TGAAAGCTGT GCTAAGAGGA 16500

AAACTCATAG CGCTGAGTGC CTGCAGAAAG AAACAGGAAA GAGCATATGT CAGCAGCTTG 16560

ACAGCACACC T7VAAAGCTCT AGAACAAAAA GAAGCAAATA CACCCAGGAG GAGTAGAAGG 16620

CAGGAAATAA TCAAACTCAG AGCTGAAATC AACCAAGTAG AAACAAAAGG ACCATAGAAA 16680

GAATCAACAG AACCAAAAGT TGGTTCTTTG AG7LZ\AATCAA CAAGATAGAT AAACCCTTAG 16740

CCAGACTAAT GAGAGGACAC AGAGAGTGTG TCC7\AATTAA CAA7VATCAGA AATGAAAAGG 16800

GAGACATAAC AACAGATTCA GAGGAAATTC AAAAAATCAT CAGATCTTAC TATAAAAACC 16860

TATATTCAAC AAAACTTGAA AATCTTCAGG AAATGGACAA TTTTCTAGAC AGATACCAGG 16920

TACCG/IAGTT AAATCAGG/IA CAGATAAACC AGTTAAACAA CCCCATAACT CCTAAGGAAA 16980

TAGAAGCAGT CATTAAAGGT CTCCCAACCA AAAAGAGCCC AGGTCCAGAC GGGTTTAGTG 17040

CAGAATTCTA TCAAACCTTC ATAGAAGACC TCATACCAAT ATTATCCAAA CTATTCCACA 17100

AAATTG7VAAC AGATGGATCA CTACCGAATA CCTTCTACGA AGCCACAATT ACTCTTATAC 17160

CTAAAAAACA CAAAGACACA ACAAAGAAAG AGAACTTCAG ACCAATTTCC CTTATGAATA 17220

TCGACGCAAA AATACTCAAC AAAATTCTGG CAAACCGAAT CCAAGAGCAC ATCAAAACAA 17280

TCATCCACCA TGACCAAGTA GGCTTCATCC CAGGCATGCA GGGATGGTTT AATATACGGA 17340

AAACCATCAA CGTGATCCAT TATATAAACA AACTGAAAGA ACAAAACCAC ATGATCATTT 17400

CATTAGACGC TGAGAAAGCA TTTGACAAAA TTCAACACCC CTTCATGATA AAAGTCCTGG 17460

AAAGAATTGG AATTCAAGGC CCATACCTGA ACATAGTAAA AGCCATATAC AGCAAACCAG 17520

TTGCTAACAT TAAACTAAAT GGAGAGAAAC TTGAAGCAAT CCCACTAAAA TCAGGGACTA 17580

GACAAGGCTG CCCACTCTCT CCCTACTTAT TCAATATAGT TCTTGAAGTT CTGGCCAGAG 17640

CAATCAGACA ACAAAAGGAG GTCAAGGGGA TACAGATCGG AAAAGAAGAA GTCAAAATAT 17700

CACTATTTGC AGATGATATG ATAGTATATT TAAGTGATCC CTWVCATTCC ACCAGAGAAC 17760

TACTAAAGCT GATAGACAAC TTCAGCAAAG TGGCTAGGTA TAAAATTAAC TCAAATAAAT 17820

CAGTTGCCTT CCTOTATACA AAAGAGAAAC AAGCCGAGAA AGAAATTAGG GAAACGACAC 17880

CCTTCATAAT AGACCCAAAT AATATAAAGT ACCTCGGTGT GACTTTAACA AAGCAAGTAA 17940

AAGATCTGTA CAATAAGAAC TTCAAGACAC TGAAGAAGGA AATTGAAGAA GACCTCAGAA 18000

GATGGAAAGA TCTCCCGTGC TCATGGATTG GCAGGATTAA TATAGTAAAA ATGGCCATTT 18060

TACCAAAAGC AATCTACAGA TTCAATGCAA TCCCCATCAA AATACCAATC CAATTCTTCA 18120

AAGAGTTAGA CAGAACAATT TGCAT^ATTCA TCTGGAATAA CAAAAAACCC AGGATAGCTA 18180

AAGCTATCCT CAACAATAAA AGGACTTCAG GGGGAATCAC TATCCCTGAA CTCAAGCATG 18240

ATTACAGAGC AATAGTGATA AAAACTGCAT GGTATTGGTA CAGAGACAGA CAGATAGACC 18300

AATGGAATAG AATTGAAGAC CCAGAAATGA ACCCACACAC CTATGGTCAC TTGATTTTTG 18360

ACAAAGGAGC CAAAACCATC AAATGGAAAA AAGATAGCAT TTTCAGCAAA TGGTGCTAGT 18420

TCAACTGGAG GTCAACATGT AGi\AGAATGA AGATCGATCC ATGCTTGTCA CCCTGTACAA 18480

GCTTAAGTCC AAGTGGATCA AGGACCTCCA CATCAAACCA GACACACTCA AACTAATAGA 18540

AGAAAAACTA GGG7y\GCATC TGGAACACAT GGGCACTGGA AAAAATTTCC TAAACAAAAC 18600

ACCATGGCTT ACGCTCTAAG ATCAAGAATC GACAAATGGG ATCTCATAAJl ACTGCAAAGC 18660

AACTGTAAGG CAAAGGACAC TGTGGTTAGG ACAAAACGGC AACCAACAGA TTGGGAAAAT 18720

ATCTTTACCA ATCCTACAAC AGATAGAGGC CTTATATCCA AAATATACAA AGAACTCAAG 18780

AAGTTAGACC GCAGGGAAAC AAATAACCCT ATTAAAAAAT GGGGTTCAGA GCTAAACAAA 18840

GAATTCACAG CTGAGGAATG CCA7\ATGGCT GAGAAACACC TAAAGAAATG TTCAACATCT 18900

TTAGTCATAA GGGAAATGCA AATCAAAACA ACCGTGAGAT TTCACCTCAC ACCAGTGAGA 18960

ATGGCTATGA TCAAAJ^CTC AGGGGACAAC AGATGCTGGC GAGGATGTGG AGAAAGAGGA 19020

ACACTCCTCC ATTGTTGGTG GGATTGCAAA CTGGTACAAC CATTCTGGAA ATCAGTCTGG 19080

AGGTTCCTCA GAAAATTGGA CATTGAACTG CCTGAGGATC CAGCTATACC TCTCTTGGGC 19140

ATATACCCAA AAGATGCCCC AACATATAAA AAAGACACGT GCTCCACTAT GTTCATTGCA 19200

GCCTTATTTA TAATAGCCAG AAGCTGGAAA GAACCCAGAT GCCCTTCAAC AGAGGAATGG 19260

ATACAGAAAA TGTGGTACAT GTACACAATG GAATATTACT CAGCTATCAA AAACAACGAG 19320

TTTATGAAAT TCGTAGGCAA ATGGTTGGAA CTGGAAAATA TCATCCTGAG TAAGCTAACC 19380

CAATCACAGA AAGACATACA TGGTATGCAC TCATTGATAA GTGGCTATTA GCCCAAATGC 19440

TTGAATTACC CTAGATACCT AGAACAAATG AAACTCAAGA CGGATGATCA AAATGTGAAT 19500

GCTTCACTCC TTCTTTAAAA GGGGAACAAG AATACCCTTC GCAGGGAAGA GAGAGGCAAA 19560

GATTAAAAXA GAGAATGAAG GAACACCCAT TCAGAGCCTG CCCCACATGT GGCCCATACA 19620

TATACAGCCA CCCAATTAGA CAAGATGGAT GAAGCAAAGA AGTGCAGACC GACAGGAGCC 19680

GGATGTAGAT CGCTCCTGAG AGACACAGCC AGAATACAGC AAATACAGAG GCG7VATGCCA 19740

GCAGCAAACC ACTGAACTGA GAATAGGACC CCCGTTGAAG GAATCAGAGA AAGAACTGGA 19800

AGATCTTGAA GGGGCTCGAG ACCCCATATG TACAACAATG CTAAGCAACC AGAGCTTCCA 19860

GGGACTAAGC CACTACCTAA AGACTATACA TGGACTGACC CTGGACTCTG ACCTCATAGG 19920

TAGCAATGAA TATCCTAGTA AGAGCACCAG TGGAAGGAGA AGCCCTGGGT CCTGCTAAGA 19980

CTGAACCCCC AGTGAACTAG ACTGGTGGGG GGAGGGCGGC AATGGGGGGA GGGTTGGGAG 20040

GGGAACACCA TAAGGAAGGG GAGGGGGGAG GGGGATGTTT GCCCGGATAC CGAAAGGGAA 20100

TAACATCGAA ATGTATATAA GAATACTCAA GTTAATAAAA AAAAAAAAAA AAAAAGAGAT 20160



CACTGCTCTT GCAGAGGCCC CCAGTTCTGT TCCCCACAAC CTCTTAGGAT GACTCACAAC 20220

CACCTGTAAC CCCATTTCAG GGGATCTGAT GCCCTCTTCT GGTCCCCATG GGCACTGCAC 20280

TCATTTACAA ATACTTTCAC KCKQKQPJZKZ ATGCACAGAA ATGGAAATTT AACAGAATAA 20340

ACATCGGAAC ATTAAAAACA AAACA7\7^?ICA AAACAAAAAA CAAAAAAAAC CCCATAGGAC 20400

TGGAGAGATG ACTCAGTGGT TAAGAGCACT GACTGCTCTT CCAGAGGTCC TGAGTTTTWl 20460

TCCCAGCAAC TACATGGTGG CTCACAACCA TCTGTAATGG TCTCTTCTGG TGTGCCTGAA 20520

GACAGTGACA GTGTACCCAC ATACATGAAA TAAATAAATC TTTAAAAAAA AAAAGCCCAG 20580

AAAGTGATGA ACTCTATTAG CACCAAAAAG AAAAAAAAGA AAAAGAAAAA CTCAAATCAA 20640

TCTTGAAGTC TCTTTCGCAT ATCTCTTTGG CCTCCACCCT GTCTGTGGAT CCCACATGTG 20700

QQQQZQQK^Q GGCATCTGTG TTCATTTGCT GAGTGTGAGA GCCACATAAA GTGCTGGTTT 20760

ACGTGTTTAC TTGTTTTCTA GATGGGATGG AGCCCAGGAC CTTAACCTTG TGGGGCAAGA 20820

CTTGCTCTCC TGAGCTCTAC CCAAGCAGTC TGGATTGCGG GTTTCCTGTT TGTCTGTGAG 20880

CTCTCTGCTT TGTGGTCATT TGTGCCCACT GGCTCTTAGA TCCCATGACT TCCCAGAGAA 20940

CGCTGTCCTG CAGAGGCAGA CACAGGGCCC CTGAGCTCAG GCCCGGCCCT GGAGACAGAA 21000

ACGGAGAGGC CAGTTGATTC TTGATATTTT CCGTTGTGGC TTCCTTGGGG CCTGTGTGAG 21060

AATTCAGCTC TGTAGAAACC TTATGGTTCT GCAACTACCC TCCCCTGGCC AAGACCCTTT 21120

CTTCTAGCCC GGGATTACCC CCTCAACCTC TGAGGTCGCC GCCAAGGTCT CCTTCCAGAT 21180

ATGGAAGTGA CTGGATGAGT CCCTTGCGGC CTCGCCTGCC TTCCCATCAC CZKQQCZCC'Y 21240

GTTTGGCTTT GCCCTTTCCC ACAGAAGTCC ACCATTGCTG TTTGGACTTC CT^AATGGTGC 21300

TCCTAAGTCT GTCTGCCGCA GGCCTTACCC CAGTCGGGAG TGGGAAACGG GCCTAACTGG 21360

AGATGACAAC CTGTAGAGAC CCCTCGGTCC TCCTAGCAGC CTGCTGGGCT GTTCTCCCTC 21420

TG7U\TTGCCA ATGTCCATGG CGTTCCCGGT GCCTTTCCTC CCTCCCGTTT CTGACAATTA 21480

GACGCCAGTC AAGTTTGAAA AGGAAATCTG CTTTATTTAT TTATTTATGT TTTTTTTTTA 21540

ATTTTTTTCT GGTAGTGGCC ATGGGGAACG AAGGAAGCGC CCTAAAGGTA TCATCACAAA 21600

GCAGGGCTCA GCGGCCGGTC TCAGTGCTGG GAGAAGGCGG CCTCAGGGTC GCAGGCGGGG 21660

TCCTCGTGGC AGCCGTCTGC ACAGAGAGAC GCCGAGTCAG GGCCGCCCTG GCCCAGCCCG 21720

CCTGGTCCTG GAGCCCCGGT CCCGTCTCGA CCCCTGCCCG ACTCACCCGG GCTGCAGCAG 21780

ATGCCGGCGG CGGCGCAGCG GCCCCCGCTC CCGCAGGGCT TCTGGCCGGA CTGGCAGGGC 21840

GACGGCAGGT AGTTCTCCTC TTGGCAGCGC AGCGCCTCGG CCGTGCCCAC GAAGCAGCCC 21900

AGCTCGTCCC CGCAGCAGAT GCTGGGCCCG AAGCAGCGGC CTTTGCCCCC GGGGCCGCAG 21960

GGGAGACACT GGTGGGAGGG AAGGGATGAG CCGGGGGCGG GAGGGGAGCG GCCGGGGAGG 22020






GAGACCCTGT GGGGCGGGGG GCTGAGCCGG GCGGGCGAGG GCGGCCGGAG GAGCGCGGGA 22080

GGTGGCGGGT CTCCCTGGCT CTCTCTTTGG GCTCAAAAGC GGTCGAAGGA GGGCAGTCAA 22140

AAGCTCCTCC GCTCCCTCGA TTCCCAGGCT AGGTGGGGCC GGTACGCGGT GAGCGCGGGA 22200

AAGGGGGCGG CGGGGGCGAC CCTGTGGCAG CGGGCCGGGC AGCCCGGAGA GCCACGGGTC 22260

GAGGGCGGGG CTCTCACCGT GCGCACGTCG AGGTCCAGCA CCGCGCGTTT GCCGCCCAGG 22320

GGGCAGTTCT GAATGTAGCA GGCGGAGGTC AACGCCAGGA GGCCGAGCAG GCAGCAGGCG 22380

AGGCTGGAAC CTGCCATGGC GTTGGTGTTC AGTCCGAGAT CGGTCGACCG ATCCACCGTC 22440

GGTGATGGTT TCTCCAGCCC AGACCGACCT TTTTATGCCT TGTCCACTGC CATGGTGGGG 22500

CCCAGTCTAA GAGGGTGACT GCATGACTGG TCACAGCCAG GTCTCTTGGG TCAAACTGTT 22560

CCACACTGTT TAGAAGCAGG CCCTTCATTT GCAGGGTCTG GGCTGGGGTC AAGGTCACCG 22620

CCTCAGCTAA TGACCTGAGC TCAAAAGGGA CACAGCCTAG AAGGGGAGGC CTAAGCTACA 22680

AGAGGATAAA GAGACTTGGA GGGGGTAGAG GTGCAGCCTA GCCAAGAGCT GTTTTTTCAT 22740

AGAAATCCAA TACCTCAGAA TGAGGTTGGA TAGCGCAAGT GGGTGAGGAA GCCCTTACGT 22800

GGATCTAAAG CTTAGATGGG GAAAAGGATC TTGTTCAATC TCTGAGTGCA GCTCAGCCCT 22860

TCTTCTAACT AGCCCGTAAA ACAAAATATC AGTAGAAATC AAACCCAAAA ACACAACAAA 22920

CAGACCAAAA TAAAGTAAAA AGAAAGAAAA ATCACAATAA AAGGAAAAAT CACACTTGCA 22980

CTTACAACTC TGTATTAGGG CTGGAGAGAT GGCTCAGTGG TTAGGAGCAC TGACTGCTCT 23040

ACCAAAGGCC CTGAGTTCAA ATCCCAGCAA CCATATGGTG GCTCACAACC ATCTGTAATG 23100

GGATCTGATG CCCTCTTCTG GTGTGTCTGA AGAGAGCTAC AATGTGCTTA TATATAATAA 23160

ATAACTTTAA AACTCTTTAA AJ^CTCTGTAT TAGAACTTGC TATGAGGACC AGGCTGGCCT 23220

TGAACTCACA GTGATCTATT TGCTTCTGCC TCCCAAATGC TAGGTACCTA CACTCCCGTT 23280

TGAGAAAACA CAGGCCATCA GCTGCTTGAG CGTGGCCAAC AGGCGGCCTC AGCTACAGAG 23340

AGCCATTTGT CCTAAGGCCA TACCCTTCCT GGTGGCCACA TGTAATGGTG GCCCATTTTA 23400

GTACATACAA CTAGGCATCT CGTGTTGCAT TTCAGGGTTG GGCTGCAGGC CTGCATAGGT 23460

CTGCATGGGA AGAATGCTAC ATGCAGCTCA GTAGCAGACT GCCTGCCTAG TGTGTGAGAG 23520

ACCTTGGGTC CAGTCCCCAG CATGGTGGTA GCAAGACATT TTGGGAACAG TTTTTGCTTT 23580

AAATTTTAAC TTTTATTTGT GTGTATGTTT GTGTACACAG GTTTCCTTGG CAACCAGAAG 23640

AGGATGTTGT ATCCCCAGGA CGTGTCATTA AAGGGGATCA TGAGTAGGCC TATGTGGGTG 23700

CTGGGAACAA AATTCAGAAT TCTGCAAGAG CAGTGTGCAG ACTTAACCAT TAAGCCATCT 23760

CCCCAGCTCC TTGATTTTGC ATTTGAATAC AGTTTAAAAT GAAATGCACA TTAAGCCACG 23820



PCT/GB99/02658

GTGACAGTGA TGCAAACTTT TAATCCCAGA ACTCAGGAGG CAGAGGCAGG AGAATGTCTG 23880

TGAGTTCCAG GCCAGCCTGG TCTACAGACC TAGTTCCAGG CCAGCCTGGG CTACACAAAA 23940

AJUi^CAAAAGC AAAACCAAAA CAAAATA/LAG ACACAGACTiA ACCATGGCAG GAAGACATGG 24000

GAGCCTCAAC CTCTTCATTT GACGGCTGAG AAATCGAAAA CAGATGACCA GGAGAGACCA 24060

AGGTCTCACT GCTGCCTTCA AGGCTTGCCC TCAGTGACTG GAAGATGTTC CACTGGGCCG 24120

CATCTTAATA TCTTACCATC TCAGGGCTGG AGAACTGGCT GAGTGGTTGA GTTGCTCTTG 24180

CAGAAGTCCT AGGTTTGATT CCGAGGACCC ACAGGGTGGC TGAGATCACT TATCCCAGTT 24240

CCAGTGGAGC CAGTACCCAA ATAGTGCATT ACACACTTGC AGGCAGAACG TTCAGACACA 24300

TAAAATAAAA TAAATAGACC TAAAAACATT TAAAAGAAAG GAGAAGCATT ATCCAGAGTC 24360

GTTTTATTTT GTTTTGAGAT AGACTCTTAG TTGACCTGGG ACTGTCTGTG TAGACTAGGC 24420

TGGGCTTGAA CTCACAGCGA TCCCCCTGCC TCCCAAAGTG CTGGGTGTAC CACCGTGCCA 24480

GGTACCTAGG CCCCTGTTTA AGAAGACACT TGCCATCAGT GGCTGGGTGT GGTCTTAGCT 24540

GCAGAAAGCC ACCTGGCCCT TCCCAGGTGT CCACATATAA TGGTTGGTCC ACTTTGGTAC 24600

GAATGCTGGG CACCCCAACT GCATGTCAGC TTTGGGCTTT GGGTTAGCTG AGGTCTGCAT 24660

ACTGGTTCTA GTTGCCCACC CCTTCTCTTC CATAGAGGTG GGGCCTAAGC CCGTGTTCTA 24720

AACTCCATCT CAGGCTCTCT TAAGAAGTGA CCTGCGACAT CCAGGAAGAA GTAACAGCCA 24780

GTGCCCCCGA GACCCACTCA CTACATGCAG TCTCAGCCCC TAGAGAGGAT GGAAAAGCCT 24840

CCGGTCTCCT TGTTCTTATG ATCAGCCTTC TCCTCAAGGA GCTGGGGCCA GTGGGGCAAA 24900

GCACATTCTC TTCTGACCCT GAATCACAGA TCCTGAGTCA CTGGTGCAAA CTATC7\AGCG 24960

CTAAGTTGGT GGTGAGGTTG ACCTGTACTA CAAATCACTT CATTTCTCAC CCAGACTAGC 25020

TTATTGGCAT TCCAGGCATA GAAAGCCAAG AGCTTGACCC CCACTATAGC CCCAGAGAGA 25080

CAGCCCACAT AGTCTGTGGG CATAGTGATC TCATCTTAGG TAATCCATGC ACATAAATTA 25140

GCATGTCTTG ATAATACATA CCTAATGCTC CTGTTAGGCC AGCATGCCTA ACATGCTCAC 25200

CAACCCAATC TGTGTTTGGG AAAGGCCAAT ATTCCGCAAG GCAGAATGCT AGTCCTTCAG 25260

GAATGGGGCT GCAGCTGGAC TGGGGAGAAC ACACTGAGGT TATAAGAGGA CCATTGAGGC 25320

CTAATAGCCA AGGTAGAGTA GGCGGAGCCT TGGGTTACAG TGTTCAGCAC CAGGAGGAAA 25380

GAGTCACTAT CACCATGGGG TTCATCTGTC ACTGGAGGAA GCAGAATATG AACTAAGAGG 25440

CATATTATGT TGGGTTACGA CTTTAGTTAA GATCTGAGTG TATCCCATGT GATACATTGT 25500

CAGTCCTTAG GAAGATGTCT TGGGAGATGG TGAGATCTTT AAGATAGAGA CCCAGTGTCA 25560

GGTTCTTTAT GTCTCTGACA GCATGCCCAT GAAGGAAGTG GTCTCTCCTG GATCTCTTTT 25620

TCAGTTTTGC AGGCATGGGA TGAAGGGGTG TATCCTTCCG TGTGCTTCTG CCATGATGTG 25680



TACTTCAACA TAGACCTGTA AGGAACAGTG GCTACAGATT GTGGACTGAA GTCTCTGAGA 25740

CTGTGAGTCC AAATAACCCT TTCTTTCTAG GCATGGCGGC ACACACCTGT AATCTCAGCA 25800

TGCTGGAAAT GTGCAGCAGG ATCAGGAGTT AAAGACCAGT CTCAGATAAA TGACAGTTCA 25860

AAGCCATCAA GGGGATAATG AGATACTTCC TCAAAAACCA TCAJVATTAJU\ ACTTTTGTTT 25920

TTATACATTA CAACTTGTCA GGGGTTTTGC TATAGTAATT AAAAGTCACC ACAGGAAACA 25980

AAGGCACGTA AACATAGCAA CATGTGCTAT GTTTAAGGCA ACATGTGCTA GGAAGGTAGA 26040

TATCACCATG CTGGGTGCTT AGACCAGGGC TATGTCGAGG TCCCGGAGGA GAGCTGAGGA 26100

AGCCCTGGGT GAATGTATAA TGTATCACGG GCCTCAGACC TGTGAGATCT GGCAAAGCTT 26160

CCCCCTGCAC GCTGTGGGTG AGGTGAATGG GGATTCGGCA GAGCCTTTGT CTGGTCTGAG 26220

TGCxAAATGCT GACGGTATGT TCTAGTGGAG GTGTTTACAA AGGACGGGCC AGTGTGCGCT 26280

TTAGCCATAG AAGTGGTGGC TCCCTGATGA ATGTCCACAA CCTGGGATTG CTGCCCACAA 26340

GATCAGCCAG GCCCTCTCCT GCGCTGTGCA GAGTGAACAC ACGGAGGTTC TGGGCTGCTC 26400

CAGTGGCTGC TACCATTCTG CCAGAGAGTG CACAGGCCAC CTGACCCCAG CCTTTCTGTC 26460

CATGTGTCTG TCCTTTCTTC ACTCTCTCAC CACCCTTGTT AGGGTCCCAG ATCCAAGTTA 26520

TGTAGGGGGT GGATTAGGAA ATGCTATGGG ATGAGAGGCA GTGTTGGTTG TCATTCTCCT 26580

TAGGGTAACC TGTGAGTATC AAGGAAAGAA AGTGTACACG CAGAAGGCTC ACCGTGCTGC 26640

TGCTATGTAC AAGTGAGCAC AAATGTAACC TCTGGAAATA CCCATTTATC ATGTCTGTTT 26700

TGGGGGCAGA GCCCAGGCAG GCGTTTCTAC TCATGGTCCT AGGAGCAGCC TCTCCTCATC 26760

TGGTATGCAG CCCTTCCTTA TCCGAGACGG AGCCTGGTGC CGGGACACAG GTCATTTCCC 26820

TGCAGTTGTA TATTATTTGG GCAGCTCACT TCTTTAAAAT ATTTTTGAAA AAATTATGTG 26880

TATGAGCCTG CATGTATGTC TGTGCAAJ\AT GTCCACAGAG GCCAGAAGAA GGTGTCAGAC 26940

CCCCTGGAAC TGGGAGTTCC GGGTGGTCGT GTTTGGCATA TGGGGCCTGG AAAATGAACC 27000

CTGGTCTCCT AG/y^GAGCAA CCAGTGCGCT CAGCTGCTGA GCACCTCTCC AACTCCTGCT 27060

TCTCTGGACT GGGAGACAAA GGAAAAGTGA GAGACTGATT CTGTTCTGTC AAGTCTCTGA 27120

GCATAGGGAA GACCTAGGTT CATTCTATGT CATGTGTCTG TCTGTCTGTC TGTCTGTCTA 27180

TCTATCTATC TATCTATCTA TCTATCTATC TATCTATCTA TCTGAGACAG GATTTCACTA 27240

TGTTAGCCTT GGCTGTCCTG GAACTCTATG TAGACAAGGC AGGTCTTAAA CTCGCAGAAG 27300

ATCCTGGTGG TCTCTCCCCA CTTTGCCTGA TTAGGCTCAC TTTTAAAGGG AATGAAATGG 27360

GCTGGGTGTG GCAGTACACA CCTTGCTCTC AGCACTCCGA GGCACAGAAA GGCAGATCTC 27420

TGAGTTTGGG GCCAGCCTGG TCTATGCAGT GAGCTATAGG CAAGCCAGGG CTACATGGTA 27480

GGACCTTGTC TTAAAAAGAG CCCCAAACAA ATAGCTCACT TGCCCAGGTG AGGTCCACCA 27540

GCATCTCTAC ATTTTGACCG GAAGCTAAGA GGAATCTTTA TTACATCACG CCTGCCACAG 27600

TCTCCATCTT TGTTGCAGCT GGAGTGCTCC CACAGGGCTT CCACTGCACG CACTGCACCC 27660

GAAGGGGCTT CCACTTCACG CACTTCACCC GAAGGGGCTT CCACTTCACG CACTTCACCC 27720

GAAGGGGCTT ACACTTGATT CACTTCACCC GAAGGGGCTG ACACTGCTTG CACTGCACCT 27780

TAAGGGGCTG ACACTGACCC AATGGCACCC GAAGGGGCTG ACACTGACCG CACTGCACCG 27840

AAGGGGCTGA CATTGCACAC GCTGCACCCA AAGGGGCTGA CACTTGCTGC ACTGCACCCC 27900

AAGGGGCTGA CACTTGCACG CACTGCACCT ACCAAGGGTG ACACTGCACC TGCTGCACCC 27960

AAGGGGGCTG ACACTGCATG CACTGCACCT ACCGGGGCTG ATACTGCACC CACTGCACCC 28020

AGGGGGGCTG ACACTGCACC CACTGCACCC AGGGGGGCTG ACATTGCACA TGCTGCACCC 28080

AAAGGGGCTG ACACAGCACC CACTGCACCC GAGGGAGCTG ACACTGCACC CACTGCACCT 28140

ACCGGGGCTG ACACTGCACC GCTTGTAATG TACATTACTG TTTTTTTTTT TTCTTTTCTT 28200

TTTTTCAGAG CTGAGGACCG AACCCAGGGC CTTGCCCTTG CTAGGCAAGT GCTCTACCGC 28260

TGAGCTAAAT CCCCTACCCC TACATTACTG TTTAGAAACA AATTTATGGT CCTTCTCACA 28320

TGCTGCAGGA GATTACACAA AGTTGGGGGT TATCAAGAAT GTGGATCACG GTGGATCATT 28380

TTAGCACTGT CCCCCCCACA GAAAGGGTCA TTTCTAGACA GAAGAAAATA GTTTATATGG 28440

AACACTTCTG GGCTGGGCAG TGGTAGCACA TGCCTTAAAT CCCAGCGCTT GGGAGGCAGA 28500

AGCAGGCGGA AGCACGCGGA TGCACGCGGA CGCACGCATA TCTGTGAGGT TTAGGCCAAC 28560

TCGGTCTATG CAGCAGCTTC CAAGACAGCC AAGGCTGTAT GGAGACCCTG TCTCGGGGTT 28620

GGTGGGGAAT CTCTTCACCG TCTTGGTCAC TTCTTTATGT GTGAGACACA TAGACGTTTT 28680

TCTTCTGAAT ATTTTATTGC TGCTTGTGGC ATTCACAACT TAGGGAAAAA TTGTTAAATG 28740

CTGCATTCCC AGCACTTGAG CCAGTGAAGT TCAGGCCTCC GCTCGTCTTG TAATGGTATT 28800

TGCACAGGGG ATGCCTTGGC TGAGTGAGTT CTTCCAGAAA ACTCCTGGGC CCTTAACACC 28860

TATTTCCAGC ATTTGGAAAT CCGAGGCAGG AGGATTGACA TGAGTTGCAG ACATAGTCAG 28920

CTAGAAGTGC AGCATTAAAT CCTATCTTAA AATAATTATT AGAATAATTT AGGGGGAAAA 28980

GCCTCTAATA GAGATGGGAG AGTGTGCGCA TGACTGCCCT ACTGTGTGCT TCTAGAAATC 29040

AATATGAATG GGCCAGAACT AGAGAAAAGG CTGTGAGAGG CTGTACCCTA CTGTGTGCAA 29100

CCCACTTCCC TCCTACTATG TGGGTGCTGG GCATGACACG AGGTTATCAG GCTGGGTGAC 29160

AAGCACCCTT ACCTGTGGGC AGTCTTGCTG GTCCAACCTA TTTGCATTTG AATCCCAGCT 29220

ACTTCAAACC CCATGGGTGC ATATTTACCC ACTTTTGGTT TTGGAAACAG GATCTTAAAA 29280

TAAACAGGTC TCACTCTGTA ACCCATGCTG GCCTGAATTC AGCATCTTCA GCCTCAGTCT 29340

CCCAAGCGCT ACGATTTCCT ATGTGCCATA TGTCACAATA CATGCACTTC AGTTTTGTCA 29400

AAAGAAGTGA ACCAGGAATA ACTGGTACCT ACCTATAAGA CTGCTGTGAT GAAGGAGGAC 29460

ATTGTGTAAA ACGAAACTCA GGATATAGTA AGTGCTCAAC ACGTGTTAGA CATGTTGGTC 29520

TCCATGAGGG CACAAACCCA GGGCCTCATG CATGCCAAGA ATTGGCCCTA TCACTGAGCT 29580

ATACAATTAG TCCCTATGAC CTACTGTGAC CTCAGACGCA CACCATGGAT CTGACATTGC 29640

ATCAAATCAG AT^TGAATTT CTGAAAGACT TGCTCATAGC ATGCCCTCCC ACACCCCCGT 29700

CCCAGCCCCC CCTCTCACTG GCAAGGACAT CTCACTGTGG TGGTGGCAGG GCCTCTAAAA 29760

CATCATAGGA TAGCTGAGCA GCAGTGGCAC ATGGCCTCTC AGTCCCAGCA CAGGGGAGGC 29820

AGTCAGCCTG GTCTATAGTG TAAGCTCCGG GACAGCCAGG GCTACATAGA GAAACCCTGT 29880

CAAACCTACC CTACTTAAAA ACAGAAGTAG AGCAGTAGTT TGGATTACAC TGCTTTGACA 29940

CTTGGTGGGT AGCATGTGTG CACCTGCCCA GGAGCTATCT GGATTCTCAA ATGGAAGACA 30000

CAGACACAGA CACAAACACA AACACACACA CACACACACA CACACACACA CACACACACA 30060

CACACACACA CACACCAGTT AACTTTTGAC ACGCCATGAC TAGCTCAAAG GCTAGGGACT 30120

CCCAAACCTT CCCCTGTCAG CAAATGCTCC CCTCTGGTAC TCCTGAGACT AAGCTAAGCC 30180

TTCCCCTGCT GTCCCAGGCC CAJVCGGAGGA AGTGAGCATG GTCACTTACC TGATTCTTTT 30240

TTTTCTTTTT TTCGGAGCTG GGGACCAAAC CCAGGGCTTG CGCTTGCTAG GCAAGCGTTC 30300

TACCATGAGC CAAATCCCCA ACCCCACATA TTCTGATTCT TACATGGCTG ATTGGCTTTC 30360

TGTCCCTGCA GTTCTTACAT CCTGTCCTTC TTCCCTGAAT CATGAGGACC CTCTCCTCTC 30420

TCTCTCTCTC TCTCTCTCTC TCTCTCTCTC TCTCTCTCTC TCTCTCCCTT CTCTCTGTGT 30480

CTCTGTGTCT CTGTCTGTCT GTCTGTCACA CACACACACA CACACACACA CACACACACA 30540

CACACTAGCC CATGCAAATC TAAGGGCCCC TTCCCGTCTC CCTTTGCCTG ACCATTGGCT 30600

CCTGGCATCT TTATTGATCA ATCAAAAACC AATTGGGGAT AAGGACCTAC AGTGTTTGGA 30660

CATGAAGATT CCTAATTTGG GGGCTGCATT AATTCAAAAC ATTGGAACCA ATTCCCAACA 30720

ACAACAACAA CAACTU^TAA AGCAAAGAAA AAAGTTTACA ACGCTGCCTT CATTATTTGA 30780

GAAACAAATG TAAGGAAAAC CATCAGGTAT CTGGACTTTT AAACGGCCGG GATTTAGAGA 30840

CTCTGGGATG TTTTCTGTGT TGGGGATTGA AGCCAGGGCC CTGGGACATG GTAAGGAAGC 30900

ACTGTACCAT GAAACTACAC CCCAGAGTCT GATAAGGCTA CTGAATGACA ATTAAAGATT 30960

CATAATTGCT GAAATTCTGG AAAACTCTAA GCTACCAATT TTGTATATGC TCAACTTGGT 31020

TTCCTGAAAA CATCTGAGGT TCTTGCACGT AACTTTTCCT CAGAGCAAGT ACAACTAAAT 31080

TCTGACTTTG TGACAATAAA GATTGTCAGG AAAGGCTTTG TGAAAATGTT CAGTCCCCAG 31140



GAGACGTGCC CTCCTGCAGC CTGTGAATGG CGGCCAGGTC ACAAGTCAGC AGATGCAGTG 31200

GAACGGAGTG TGGTACTTCT GTGAGACACT GCAGGACTGG ATGGATGGCT TAGTAGTTT^A 31260

GAACATGGGC TGCTCTCCCA GAGGACCTGG TTTCAATTCC AAGCCTTGGG CCCTGCAAGA 31320

ACCTTATATA GACTGGCTTC AAGTTCTCCA TGTAGCTGAA TATGACATTA ^iJVCTCCAGAT 31380

ACTCCTGAGT CCTAGGTTTA CAGCTGTGTA CAGCTATGTT TCTTCCCTGA CGACCCCGCA 31440

GCCCCCATTT TGAGATAGGG ATTTAGGTAG CCCAGGCTGG CCTCACACTG ACTAAGTGAG 31500

ACTGGCTTTA AACTCCTCAT CCTTTAAGGT ACCACCATGA ATTTGCTGTA TAGCTCTGGC 31560

TGGCCTTATA TAATGTAGAC TAGACTGGCC TTTAACTTTA AAATTGTGCA CTTTATTTTT 31620

TTTTAAATTA TGTATTTTAT GTATATGAGT ATACTGTAGC TGTCTTCAGA CACAGGGCAC 31680

CAGACCTCAT TACAGATGGT TGTGAGCCAC CATGTGGTTG CTGGGAGCTC AACTCAGGAC 31740

GTCTGGAAGA GCAGTCAGTG TTCTTAACCT CTGAGCCATC TCTCCAGCTC TCTGCTTAAT 31800

AAGTGCTGGG ATGACTAGCA TGTGTCACCC TCCTGGCCAC TTCTGGTGTC TCCTTTCCAG 31860

GCTTTTAAAA ATTATCTGTT GGCATGTCCA CACAGGGTTA TATGCATATG AACGCAGGTG 31920

CCTGTGGGCT GTCCTGTCCT GGAACTGGAC TTACAGATGG CTGTGAGCCA CTTGATGTGG 31980

GTGCCTGGAA ACTAACTGGG GGTCTGAAAA AGCGGGAAGA ACTCACATGA CTGTGGAGTC 32040

TGCTACCCCT TTTATTATAA AAGAAAAGAA GATATTTTAA CAGCACGTAT GAGACACAAG 32100

TGAAAGCTGT GGCCATGGTC TTCAGGGATG GTTAGGTCCT GCAAAACTGA AGGAGGTGGG 32160

CTCTGGGTGT TGGTCACATG GTAGATTGAT AGGCCCTGGG TTCAATCCCC ACCTCTGCAT 32220

A7\AGCAGGCA TGGTGGTTCA CTGCCTGCTT TTAGAGGAAG AGGCAAGAGG ATTGGTAGAA 32280

TCTCAAGGTC ATTTTCAGCT ACATAGCACC AGTCAAATCT TTGAGTCCAA GACCAGCCTG 32340

ATCTGTGTAC TGAGTTCCAG GGCAGCTACA TAGTTGAGAC CCTACTTAAA ATTTCAAACA 32400

ACAAAACCCA CAAGGTTTAA AAACTCTATC ACTTTTAGTT ATGTTTGTGT GTAAGTGTTC 32460

GTGCCCACGG AGGTTAGAGC ACTGTATCCC CCGGAGTGGT GAGCGGGCTG GCATGAGTGC 32520

TGAGGGCTGA ATTCAGCATC TGTGTTCAAC ATATTTGTTA AAGCACACGA AGAGGAAATG 32580

GCCAGTGTCA ACAGGAGCCC AGCCAGGCTT GGGGTGGGAA AATGCTTTGA CTTCTATCTG 32640

GCAAGAAAAA AACAATTCCA AGTTTGATCC TTGCCAGACT CTTTGGCCTT TACCAGGCTT 32700

CCTCACAGAG TCTGCTGTAA CTGTTTCTGC AAATTCGCAG AGGAACCTGA GATCTCAGGG 32760

CACGTTGGAT ACCCACGTGC TGGAGAAACT GAACAATGAC TTTAGGTTTC ATCGTGCCTG 32820

GATGAAACAT GAAAATACCC CACACCGCTG AGCTGACAAA TGTGCCTCTC TCTCTGTAGC 32880

CCTTCAGTCA GCTGAGCAGT TTGCCCTCGC TCGGCTGCAG TACCAGCACA GGCACCCCAG 32940

TCTCCCCGAT GAGCGGTCTC ATGAGGATTC AGGTTGGTCT CGAACTCCCT ATGTAGCTGA 33000

AAATGACCTT GAGATTCTAC CAATCCTCTT GCCTCTTCCT CTAAAAGCAG GCAGTGAACC 330 60 

ACCATGCCTG CTTTATGCAG AGGTGGGGAT TGAACCCAGG GCCTATCAAT CTACCAAGTG 33120 

i\ACAACACCC AGAGCCCACC TCCTTCAGTT TTGTAGGACC TAACCTAGCC CTGAAGGCCA 33180 

TGGCCATGGC TTTCCCCTCA GCACCCACTT ATCATGAAGG GGCAJVGGGTC CAGTTTCTTG 332 4 0 

GTTAAGTATC TACGCTTGTG ACTAGGGAGA TACATCCTGG GCAGGAGTGA AGGGTTACCC 33300 

ATTCAGCAGC AGAGTTCCTA GGTTTACTGT GACT^CAAAG ATCTAGGAAT GGCCTAGGTT 33360 

GTCCTGACAT GATCCCATTA GCCTACCTCA GATATCTGAA TGCAGGGGCT CACTGTGTGT 33420 

CCCAGTCAGG GACAGTATTT ACTACCCTAA AGTGGGTTAC AGCTCTCGGG GGGGGGGGGC 33480

TGCGTGCAGG ACGACACCTG CACCTTCACA CTTGCTTCTT C/^TGGAGTA AGAGGCTGCT 33540 

AACATCCCCA AGGTTTCCAT TTCAGCTAGG ATGAGAGTCT GGAGTTCATG TCCCTGGTAT 33600

TCAAGTATAT GACACTGAAG AGCi^AAGAGG CAGAGAGCTC ATCCACTAAC AGGCATGCAC 33660

TGCACTCATG AACATCTGTG TCTGATCCCC GGTACACATC AAAGCCATGT GCACCGAATT 33720

CCAGCGCCAG GCAAGCATAC GCAGATGCAT TCTCTGAGGC TAGCTGGCTG GGCAGTCTAA 33780

ATACTGAGTG CCAGGTTCCA GTGAAAGGCC TCGTCTCAAA AACCAGAACA AACCAAATCA 33840

AACCAAACGT AAAGCAGACC AAGGTGAGAG AGTCTTGAAA TGACACCCAA GGGTGCTGTC 33900

TGGCCACTGC TTACACACAC TGAAGGACAG CAACTGACCG CAAGAAGCGG GTTTAGAGTG 33960

GAGTCTACTG TCTGCTGGGT AGTCCAATGA CGCTGTGTCA GGGCAGGGTC CGGTTACAGA 34020 

AATCACTGAC GGGGAAGCCT TCCCAGGAGA AACGGGGCAC CCTTTTTGCT TTCTGGACCT 34080 

TGGACACACC TGACCCTACC CAGCAAAGCC CAGGATTGAG CAAAGCAGAT AACTAACTCC 34140

TGGCTCAGTT AGGTGAACTG GCTTTTGGCT AATAACCTTA AGACCCAAAT AACTGGGACA 34200 

AATTVAJ^CTTA TTCTACAACA AGAAAAAGCA AGCCACATAA CAAAAGGCTT TGCTTTCCAA 34260

TAGTTTATTT ATAAAJ\GCAG GAAACATTGG GCTCACACTA TTCCAAGAAA CTCAGGAGAC 34320

AGCTCTTGGC TTCTAGAGGG GACAGTCCAG TCTGATCTTC TGTGTGATAG GAACTTCCCA 34380 

GAGTTTAATG CTGTCCGGAG TTTGTATGTG GTGTCAGAAC AGGTACTATA CCTGGTGCCA 34440

AATCTAGAAT GAATGGGGGC TGCTCTGGAC AAGGCTGTGC CACCCTCCTG GGATGCACCA 34500 

ATTTCTACCC CAATATCCAG TCCCTAGGCA AGCACTGTCC TTATTTGGGC CTGAGAGGCC 34560

TCTGTTCTCA GTTCCTGACC ACCTGCCCTT CTTTGGTGAC CGCCCAGTTA TAACTCTTCG 34620

GAGAGTCCAA GGCCTCTTCT ATCCGTGCCT CCAGGTTCTC CCGAGTGATG AAGTTTTTGG 34680

CCTCCTCCTA CGGGAAGAGA AAGGTTAGCT ACGTGGGATC TCTCAGAGCT GGCTGCCTGC 34740 

ATAACTTACT GTCTCTAGCC CACCTTAAGA AGGGTGTCAA AAGAGCCCAG GGCCCCTTTG 34800 




AGTGTCCCAC CAATAGGCCA GCACAACTGT ACTAGGTGAC CTCAACCATT ATATCCCTCT 34860

TCCTCGGACA ATTACCCATC AGATCTCTCA GCCACAACCC AAGAACGAAT CGAATCGTGA 34920

CAACCTGATT GGGGGCGGGG GGAGGTGCGG GGTGGTGGAC TCTCTCAACC TTGTCACTCT 34980

CCCTTACACA ATAAAAACAG CTCCATTCTC ATTTATGGTT TCCCTGACCT TCAAGTTTCA 35040

CACGGCAATA CTAGCCTGTA ACCTTTCATT GACCATTGTC TGGGAGGCAC TCTGTGTGTG 35100 

CAGAGCAGAG AGAGAAGCCT CCTGAAGACT GGTCACTTGA CACCCTGTGC TGCTCTCCCC 35160 

AGAGCCTCCC TCACCGGAAT CTCCAATACC CACATTCCTC ACGACCTCGG CCCACCTGCA 35220 

GTTTGAGAAC TTCTTGTTCT TTCAGTTGCA CCCAAGCCTG CTCCTCCTGG GCCCTCTGGG 35280 

CCTGGACCTC AGCCTGCCGC AGCTCCTGGG CCTGTGCTTC GAGCTGCAAC CTAGCTATCC 35340 

TGCGAGGAAA AAGGAGCCGA GAATGAGCGG GGCCGTGGGC CGCGCACGGG CAGAAGGAGC 35400

AGGGTTGGGG GCTGGCCCCC GGGCCGCGCG GCCAGGCGGT TCAAAGCCCG CAACATGACA 35460

GCTAGCGCCT GCGCCCTGTC CGGAAGTGGT CGAGGAGCAT CACGGGAATC TCCTGTCCCC 35520 

GCCCCCGTAG GCGGAGCTCT TGCTGTCTGG CCCATCGCGT CCCTGGAGTC TCGGGTAGAG 35580 

TGGGAGGGTC GAGGGGCACC TTGGAGACCA CAGTTCGCTG GGCTGTATTG CCCGCCAATG 35640 

ACAGCTACAA CACCAGGCGC TCTTGCCTAC TTGTTCACAA GCTTGGACAC CGCCCCAAAG 35700 

CGTGTGGTTC TCTGGCCCAC CGGTGGCCTG TGTACCCTTC CAGGGCTGAT GCAAGAAATT 35760

CTGAGCTTCT AGCTACTTAA GTTAGGCTTA CACTTCCTCT TTGAGTAAGC GTTCTCCTGA 35820 

TGCACTTTAA ACTTTAGGTT TGCAATCGAG CAACAAACCT TTTCCTTGCA TTGCACTGCC 35880 

CTGTTGTGGT CTTGGGCAGC TCTGCAATTA ATTGCCAGAG TTCTCAGTAT CAGTGATCAC 35940

TTGTTTTCTA CTGGGTGGCC TGTGTGAGGT GACGCTTGTC CTCTTTCACC TGAGCCGGGG 36000 

TCTTCCCACC TTGAGACCAT TTATGAAGCT TTTAAAGTAC TATCTTCCTC TACAATGAAG 36060 

AGAAACAGGA GGCAGTGCAG GGCAGAGAGC ATCTGTCCCA AAGGTAGATG GCTCTCCTCT 36120 

GGGAGAGTTG TTGGTAATCA GAAGCTGACG GTGAGGGCTA GGGAACTGAA GTCTCATCCT 36180

ACTCATTAGT GTTGTCAACA CCAGATTCTG CAGAGAATCT GCACTTAGGA GGGTCTGAGG 36240

AGCCTGGGCT GACTGCAATA CCTGATATTT CTCAGAAGAG AACAGAGTCT GCTTACATCC 36300 

TTCTCCCAAC TTCCAGGCCC GTAGGAGGCC CAGCCAGCAC CCTCTTCCTC TCATCTCACC 36360

CTACCCTGGA GTGACACAGG GTTGCTCTGA ACATCAGGGA ACGTGGCACT CCCATCCTTT 36420

CTTGCAACAG GTCTCACTAT ATAGCTCCGG ATGGTCTGTC ACTTACAATG TATATCAGAC 36480

TCACAAGAGG TCCATCTGCC ATTGCCTCCT AAATGCTGGG GTTAAAGGCA CATACCACCA 36540

CACCTGTCCT AAACCTTTCT TCTTCGGGGT CATCCTAGAT AACCAGTATC TCATTTCAGA 36600 

TAACTTCAGT GTCTGGGCAA AGAGAATATT TCTATGGTGT GGGTCATTCC TAGAGGCTTC 36660

CTAACCTTGC TGGCTCTGAC GTTCTCTCGG CTGGTCAGGT CTACTCATCC TTCTTTCAGA 36720

GGGTTTCATA AGTTGTTU^GA GATTTAGGCC TACGGTGGAT GAAAGATGTG GAGTCATTTT 36780

GAGTAGCTAA TGCTACAGAA CTAGAAGGCA GGTTCTCTGC CCCCTTCTCT GACCTGTTGG 36840

GGAAGTGGAA GTAACTTTCC ATTTGTGACC TTCCCCACTA GGTGGCGAGA TAGAATTGTC 36900

AAAGCTGGGA 7U\GGAGGCTT TTCTGGGCAG TTCATGGGTA GAAGGACAGA CAGACAGAGT 36960

TGGAGGAATG GAAGCCTCCT CATTTACCAA GGGGTAAACT ATGGATGAGG TGACTTCAGG 37020

TGCCTGCAGG ACCCTATGCA GACGGTCCCA GGATTTAATG ATCAGGCCAT TCTATTTCCT 37080

CTGGTGTCAA ATCCAGTGAT ATCATTAAAA C7\AAAACAAA AAAGCCCCAA TCAGGGTCTT 37140

ACTTGATGGC CTTATATTTC CAACAAAGCC CAGGCTGGCC TTGAACTTGA AGCAATATCC 37200

CTGCATCTGT CTCCAGAGTG CTAAGATTGT GTGTGTCACC ATACCAAGGT ACAGTGATCT 37260

CTTGAAACAG GGAGGTGCT^ GTCATTACTC AAACCCCTCC TCACAATGTT CTATGAGCAA 37320

ATCCGAAGTT GATGTTGGCT TTTAAAGTCA CCAGACAAGT GTCCTTCTGC TTAGATCTTC 37380

CTAGGAACTG AGGTTTGAAA CAAAAAGCAT AACATGGTTG GAGAGATGGC TCAGTAGTGA 37440

AATTCTGAAT GTGGTTCCCA GCATCCACAT TGGGCACCTC AGAATGGCCT ATAACTTC7\A 37500

TTCTAGGGAC CAAGTAATCT CTTCTGGCTT ATGGGTGTCA CTCACATGCA TGTGTGCATA 37560

TGGTGCCTAA GTAAAAAAAT AATCTTTTAA AAGCAGATTT TAAAAAAAAT TTCAACGATT 37620

TTTTTTTAAT GTTCATTGGT GTTTTGCCTG CATGTATATC TGTGTGAGGG TGTCAGGTTT 37680

CCAGGAACTG GAGTTACAGA CAGATGTGAG CTACCTGTTG GTGCTAGGAA TTGAACCCAG 37740

GTCCTCTGGA AGAATAATCA GTGCTCTTAC CCACTGAGCC ATCTCTCCAA CCCAAATACA 37800

TCTTAAAAAA AATTAAAACA GTGGACCTGC CTTCTAGTTC TGTAGCATTA GCTACTCAAA 37860

ATGACCCACA TCTTTCATCC ACCGTAGGCC TAAATCTCTT ACACTTATGA AACCCTCTGA 37920

AAGAGGATGA GTAGACCTGA CCAGCCGAGA GAGATGAGAG CCAGCAAGGT TAGGAAGCCT 37980

CTAGGAATGA CCCACACCAT AGAAATATTC TCTTTGCCCA GACACTGAAG TTATCTGAAA 38040

TGAGATACTG GTTATCTAGG ATGAACCCCG AGAGAAGAAA GGTTTAGGAC AGGTGTGGTG 38100

GTATGTGCCT TTAACCCCAG CATTTAGGAG GCAATGGCAG ATGGACCTCT TGTGAGTCTG 38160

ATATACATTG TAAAGGGGAG AACTCCCGGA ATTTGTTCTC TGACCTACAC ATGTGACATG 38220

CATGTGTTCG TGCACACACA CATACACACA CACACACTGT AAAAATGCAA AATGGCTACC 38280

AAGTGGTCAT TGAGCTTCTC AACCTCACTG ACAGCTACAT TATTATATAG ACTTACTGGG 38340

AACAGATCCG CAGGAAATTA TTTGGAATCT TTTTCTTTTC TCTAACGGGG GCTGATCTGG 38400

AACTTCTGAG CCTTTTTGTT CCCTATCATG AATGCTGGGA TGGCAGGCGT TTCCACATGA 38460

CTCGTTCGAT GTAGTATTGC AGACTGAACG CAGGACTTTC CACACACTAA GCAGGCATTC 38520

TGTGAACTGT TACGTCTCCA GCCCCATTTC TAAATTCTAA CACCAAAGTG CTAGTTTTGT 38580

CCCTTGACCT GGACACTGCA GTGAGTTCAC AGAACTTATA ATCACCCTGT TTAGTGTAGA 38640

AGCTACCTCA ATCACCATGA CATTTTTCAA AAATGTGTTC ACTTTCCTCT TTAGAGTCCA 38700 

AGCACACCAA GCTTGGCGGA ACAATGATAC AGTCTAACTG GATCTGTTTC AAAATTGCAA 38760 

CTTGACTCTA CATCTAAATA GGTATGTGTT GTGACAAGTT TATTATGTTG TGTGTGTGTG 38820 

TACACATGTG CCACAGGAAG CCAAAGGACA ACTTGCTAGA GTGCATTTTC TTTCCTGGGA 38880

ATTGGCCTCT GGTTGTCAGG CTTGGTAGCA CGCACTTTGA CCCTCTAAGC CATCTTGATG 38940

GCCCAGAGAG TGAACCACGC TGTTTTCACT TTCCTACTTC TTGGGCTGAA TTCTCAAGTA 39000

CCTGCCCTTG CAGCTTTGCA CCCTTCCTAA CTTCAAAAGG AAACTGACAT GGAGAAGGGT 39060 

GATACTTGAG GATTTCCTGG CTCACTTAGC TCAGGACTCT GGCCTAAGAA CAGGGAACCC 39120 

AGCAGTGTGA ACAGGGGTCC AAGAGAGTTC ATTTGTACTT ACCGGCAAAA CAGTGTGGCA 39180

GGCTTCACAC AAATACATAC TCGGCACCAG GACAGGGCCA CTCTGGATGG AGGTGGGCTT 39240

AGGTGGGGTA CTGCCCACCC AGGGTTGTCC TCTCTTGTAA GCAGACTCAT GGGGACAGCC 39300 

CAGAAGTGAT CCCACAGTCT CTCTGAAGCT GACAATAGGG GATAATTCTA AGTCCTCATC 39360

CTGTGCTCAT CCACAGTCCT TTGTCGATCT GGACACTACT ATCATGGGCT GCTGGAAACA 39420 

GGTCTTTGCA GCCCAAGTCT GAGCCACTAG CTCTGCTTTC ACTGCCAGCC ATTAACCTCC 39480 

GGGAGrGGGC GTGGGATAAG AAGAAACATT TATAGAGTCA ACGGCCAATC TGTATTTGGG 39540

CTGAAAACCA TATTAAGGAA GGGCCAAGCC TGGCATAATG GTGACCAGAG CCACTAGGGG 39600 

ACCAACTGCA CCCAGCTTTA GCAAAGTGAC AGGCAGCATG AGGTACCATT ATGTGTGCTG 39660 

GGCATGCGGC TTCAGGATGG CTCTGTGACC TCCTAGAGGT TGTCTTATTG GCAGGCATAG 39720 

GAAACAAAGG CAGAGAATGA ATGCTACAGC CAGAGAGACC CAGATCTGCT AAGTGGATGA 39780 

CTCTTGTACA TATGTGTGTA TGTTGTTTTT GAGGCAGGGT CTCACTGTGT AGCTCTGACT 39840

GTCCTGGAAT TGGATCTGTT GGTCTCAAGT TCAGATCCTA GTGGTTTATT TTTCCTGTGT 39900 

ATGTGTGCTT GTCATGCACA AGCATGTGTT AAGGTGAGTA GATATGTAGG CACATGGAGA 39960

TCAGAACAAT GGTGTCACTC CAAACCTTTA TAGACCTATA TCCATCTTGA CATTAGGGTT 40020 

ACAGGTGTGT TCAACATAGA TATGGCCAAA ATTT/^TGTG GGTTCTGAAG ATCTAAATAT 40080 

GTCTTGTGCT GGCTAGTTCT ACGTCAACCT GACACAAGCT AGAGTTATCT GAAGGAAGGG 40140

AACCTTAGTA GAGAAACTGT CTCCATGAGA TCCAGCTGTA TAGCATTTTC TTAATTCTTA 40200

GTTAGAGACT AATGGGGGAG GGCCCAGTCC ATTGTGGGTG ATGCAACCTT AGACAGGTGA 40260

ACCTGGGTTT TGTAAGAAAG CAGGCTGAGC AAGCCATGAG GAZ^GCAAGCC AGTAAGCAGC 40320

ACTGACCATG GCCTCTGCAT CAGCTCCTGC CTCCAGGTTC CTGCCCTGTT TGAGTTCTTG 40380

TCCTGACCTC CTTCCGTGAT GAACAGTGAT ATGGAAGTAT AACCAAATAA ACCCTCTCCT 40440

CCCAAGTTGC TTTGGTCATG GTGTTTCATC ACAGCAACAG AAAGCCCAAC TGAGGCAGGT 40500

TTTCATGCTG TATTVACT^GC TGAACCATCT TACCAGCTCC ATAGTGTTTA TTTTAAAAGA 40560

TGAGTGTGTA ACTTTCCTTT TTTTCCTTTT AAAAATCCAA AGAACCACGT TCCTCAGGAA 40620 

AAGCTCTGGG CCAGTTCTCC TGGTAACTTT GAAGTCTTTT TAAAGGCAGA GTCTATGTTA 40680 

GAC7VAGCTGG CCTCAACCTC ACAGAGATCA CCTCCCTCTG CCCGTTAACT GCCTGGTGAG 40740

CTACAATGTG TTTTTAAAGA TGTCCCTGTT CCCTCTTAAJV CAJVCTCCAAT TTCACCCATG 40800

TGTTCCCATT TGGTAGGACA GGAAGCCATT TGTTCATCAT GAAGCTTCTG CTGATGTCAG 40860

GACAGGCGCG CGCGCGCACA CACACACACA CACACACAGC AGCTTTAGTC ATTTGTGGTC 40920

AGCTGGGAAA ATGGGAJ\AAC ACGGTTGGAG CTGAGTTGAA CTGAAGAGTT GGTGGAGACA 40980

CATGGTGCAA ATCCTGAGCA GTAGCTGAAG GAAJVGGTACA AGTTTGGCAG TAGATTGGCC 41040

AATGAGGTGC AGAGATAAAG CAGAAGGGCT GCCCCGAGAG CTGCAGCATG GTGCGTGGAA 41100 

CCCTTCAGGA GGTAGAAAGG TAGAAAGGGC TGCTTGGACT ACTAGTGTGT AGATTACTGT 41160 

CTTTCAGCAG GTGAAAGACA AGGCTAGAGC CTGTGATTGG ACAGTAGAAA AGGAGGGCGG 41220 

GCTGAGAGTT TGAGAGTCTG GAGGGATAGG AGGAAAGAAG GAAGATGGAG GAAGAGAAGG 41280 

ATGACCCAGA GCTGTGTGGC TTTAAATAGC CACAGGTAGC TATGAATATC ATATAAGGGG 41340 

TGGATTATGA CAGGACAATT TGTCCACTCA AGGTGGGCAG CTTATATCAT ATTAATTGGC 41400

TCTGAGTTCT TTGTCTTGGG CATTTTGTGA GCTGAGAATT TACTGATATA AATCTGACTG 41460

ATAAATTACA AGCCTCTAGA GTTTTGATTT TACTGGGTTA CAGGGATTTG TGACAGTTAA 41520 

CTGCGAGATG CTACAGCCAG AGAGACACGG ATTCTGCTAA GTAGATGACT CTTGTACATA 41580 

TGTGTGTATG GGGGCTAGCT GTGAAGGCAG TGAAACTGCT GCCAGGGCCA GAGAGTAGTT 41640 

GGCACTACTG TGGGATGGTG CATCCATTTT TTTAAAAATT ATTTAATGCA ACACTAGTGA 41700 

GTCATCCAGT AGGAAATGCT GGGGTCTGGG GAGCTGGGGG TGGAGGAAAG CCACAAGCCC 41760

ACGGAGCCCC AGATCCCCCA CCTCTTTGGA GAATAACACT GATATCAGTG ACTCAGACAC 41820 

AATAGATCTT GGGGTTCAGC ACCCAAGCTC CTCTAGTAAG CATGGGTGCA AAAGGTGTGG 41880

AATGGAGAGT GAAGGAAGAC TTTTTCATAA GCCTGTCACA AATGAGGAGG AAGCTAAGCT 41940 

TGGGAAATGC AGGCCTTCAG TGGCAGACCA AGTGGAGTCA ATGAAGTAAG GTCTGAGTAG 42000

AAGGGCTCTG GGTGTGCGCT TCAGGCTGGG TGCACACTTC TTTCTGAGGA AATGCTCACT 42060 

TCCACTTTGA CCATTCCCTG ACCCAGGTCA TAGCTGATGT GCCAGAGTGT CATGGGTGAA 42120

GTGGTCACTT TCGCTCTTTC CACACAACTG TGCTGTGTAA GACACCCTTT CTCTGGTCAT 42180

ACAGGAGTCC CCTGTGGGGT TTGAGCCCTG ACTTAAAAAG AAAGGATGAG GGCTACTTCT 42240

GTGGAAGGGA GCAAAGAGCA GAGGTCATTC CTGCTGGAGG AGATCTGCTA ACAAGCATGT 42300

GATGTTTAAC ATTAAGGGCT GCTCATCAAG TCAGCACTGA CTCCAGCAGA GTCCTGTCGA 42360

GGCTACTCCA GTATGCCCTG GTCAAGACTA GCCTTGGCAA GGGAGCAGCC TGGGCTGTTG 42420

CTAGGTGGAT AGAAGCACAC ACAGAGGATT TTCTTAGTGG TGTATGTTVAT CACTGAGGTC 42480

TTGCTGACCC AGTAGGCATA CTCCTCCATT GCTAGACTCA GTCACACAAA GTGTACAAGA 42540

ACAGGGCATT CTTCATGGAA AATTCCTGAC TGGGTCTTTT AGAGCTCCAG TTCCTAGAGG 42600

GGCAGATGAT CCCAGTGACT TATGCTCAGT GTAAAGCTGG TCTGCTGTCA CATCTTTGCT 42660

CCCAAAGGTT TCTGGGATTC CTCCTGTACT TTCTTCTATT TTTATTTTCA AGACAGGGCT 42720

TCTCTGTGTA GTCCTGGCTG CTCTGGAACT TGCTCTGCAG ACCAGGTTGG CCTCAAATTT 42780

ATAGAGATCC ACTTGCCTCT GCCGTCCAAG TACAGGGATT AAAGTTGCAT GCCACCACTG 42840

CCCAGCCTCT CTAAAATTTT CTTAATTAAT TTATTTTTCA AGACAGAGTC TCACTATGTA 42900

GTCCTGGATA TGCTGGAACT CAGTAATGTA GAACAGGCTG TCCTTGAACA TACAGAGTTC 42960

CACCAACCTC TGCTTCCAAG TGCTGGGATT GAAGTGTGTG CCACTATGCC CAGCTAAAAC 43020

CTGTTTTATT TTCTGTGCAT GGGTGTTTGC CTGCATGTAT GTCTGTGCAT CATTTGCCTG 43080

ACTGGTGCCC ACGGAAGTCA GAGGAGGACA CTGGATCCCC TGAGGTGCCC ACGGAAGTCA 43140

GAGGAGGACA CCGGATCCCC TGCAGTGCCC ATGGAAGTCA GAGGAGGACA CCGGATCTCC 43200

TGGATCTGGA TGACTGAGCC ATCACATGGG TCTTGGGAAA AGATCCCGGG TTTGCTCTAA 43260

GAACAAGTGC TCTTAATGAT TGAGATGTCT CTCTATCCCA TGTTTCTTTG TACACAAACA 43320

CCATGGACAC GTGGCATACA CTGGGCTTCC TTTTCACACC ACTCTGTCGA ACTTAAATTC 43380

TGCTGGCGGC TCCAACTGAC CTTTCCTTTC TATTCCTAAA TTCTCGGCAT GGCTTGGGTC 43440

TGGTTAAGTC CCCCCTTTTC CAAGCAGCCG GAAGCACTTA TCTCTGAATG TGCCTCTGTG 43500

GGACACACCG GGGGACCTGC TGAAGCCTCT GAAGAGCAGA GGTGATGTCT GCCTCCCCAT 43560

CTTTGCCCTC TTGTGCTAAG AAGCTACTTG TGATGCTGGA GGTGGTGGGG AAAACCCACC 43620

AGCCTTGCCA CCTGAAGTGA AGGGCAGCCA CGGCCTGTGT CCTAGCCAGT GGGGATTAGT 43680

GAAAATGGTA AAGTGGGCAA CGAGGCTGCT TGCTTTCTGA GCTTCCTCCT ATTTTGGGTT 43740

GGTAGCAGCA GCGGCCCAGT TCCTTCCCAC TGTGGGGATG AGGAGTACGC CCTCAGGATG 43800

CCGGCATCAG AGAAGGCAAG AACAGACGCA GTGTCGCACG TCTTCAATTA CAGCACTTGG 43860

GAGGCAGAGA CAGGCAGATC TCTGCGAGTT CAAGGCCAGT CTGGTCTACA CAGTGAGTTC 43920

TAGGTTCGTC TGTGTTACAC AGGGAGAACT GTCTGAAGAA ACAAACAAAG AGAAAATTAA 43980

AGTTAGATGT AGTGGCACGG TCATAATCTA AAATGTGGCC TAGCTGTTCT CTGTTCTCTG 44040

TTTCTCTTTC TTCCTCCCTC CCTCTCTCTT CTTCATTGTC TGTCTGTCTG GTGCTTGTAT 44100

ATCAAAATGT AAGTTCTAAG ATATGCTTCA GCACCGTGCC TGCCTGCCTG CCGCCATGCT 44160

CCACCATGAT AGTCATAGAC CCACCCTCTC GAACTGTGAA TCCCAAATTT ACTTTCTTCT 44220

ATGAGTTGCC CTGGTTATGG TGCCTTATCA CAGCAACAGA GCAGTGAGTA ATATACCCAC 44280

CCTCAAAGAC T^AGCTGAAAG AGAGACCCAT GTGCTGTGGC ATGCGTGTGC CTACACTTAA 44340

CACACATAAA TAT^ATACATC TCCTG7\AGAA AATTTT^J^AAG TTATTCTGGA CAGAAACTAG 44400

AGAGGCCAGA CTGGCCTCAG CTCAAGCCCA CAGCAGCTCC TCTGTCCTGC TGTCCTTTCC 44460

TGTAGAGAAA TTCAGTGAGA CCCAAGCTGT CTGTCCTAGG GCTATAAGCT GGGTGGGTGG 44520

CTGGGATGAC CACACTTGAT AGAAAAGAGG AAAAGGAACT GGGAGTTGCG GCCGCC 44576
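
The row layout used throughout this listing (groups of ten bases, six groups per row, followed by the cumulative position of the last base in the row) can be reproduced mechanically, which helps when checking a transcribed row against its position number. The following is a minimal sketch under that reading of the layout; the function name `format_listing` and its parameters are illustrative and are not part of the original listing.

```python
from typing import List


def format_listing(seq: str, start: int = 0, group: int = 10, per_row: int = 60) -> List[str]:
    """Lay out bases as rows of space-separated groups plus a cumulative end position.

    `start` is the number of bases that precede `seq` (assumed convention:
    the trailing number on each row is the position of that row's last base).
    """
    rows = []
    for offset in range(0, len(seq), per_row):
        chunk = seq[offset:offset + per_row]
        # Split the row into fixed-width groups; the last group may be short.
        groups = [chunk[i:i + group] for i in range(0, len(chunk), group)]
        end_position = start + offset + len(chunk)
        rows.append(" ".join(groups) + f" {end_position}")
    return rows


if __name__ == "__main__":
    # A short final row of 56 bases following position 44520 ends at 44576.
    for row in format_listing("C" * 56, start=44520):
        print(row)
```

Under this convention a full row always advances the position by 60, so a row whose printed number is not the previous number plus 60 points to a transcription error in the number rather than in the bases.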
(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "Synthetic Primer" 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 
QQiKCKQCZCQ KN:^G1\(Z11'NZK GGT 
(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "Synthetic Primer" 

(iii) HYPOTHETICAL: NO 



(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:
CGAAGAACTC CGCAGGGTCC 20 
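
Each primer entry declares a LENGTH under item (A) and then gives the bases under item (xi), sometimes followed by a base count. A simple consistency check between the two can be sketched as follows; the helper name `primer_length_matches` is hypothetical, and the check assumes that every letter on the sequence line is a base and that digits and spaces are layout only.

```python
import re


def primer_length_matches(declared_length: int, sequence_line: str) -> bool:
    """Return True if the letters on the sequence line number `declared_length`.

    Spaces between groups and any trailing base count (digits) are ignored.
    """
    bases = re.sub(r"[^A-Za-z]", "", sequence_line)
    return len(bases) == declared_length


if __name__ == "__main__":
    # SEQ ID NO: 19 is declared as 20 base pairs.
    print(primer_length_matches(20, "CGAAGAACTC CGCAGGGTCC 20"))
```

A mismatch reported by such a check would indicate either a transcription error in the bases or an incorrect declared length; it cannot say which.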
(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "Synthetic Primer" 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 
AAGACCCGCC ACGACCCG 

(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "Synthetic Primer"

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 
GAATCAGCAC CCTCTCCGCC 
(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "Synthetic Primer" 



(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 
TGCGGAGTTC TTCGTGCTGA TGGAG 25

(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "Synthetic Primer" 

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 

GGTGCTCGGC GGCGTCCTTC 20 

(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "Synthetic Primer"

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 
GAGTGGCGGA GAGGGTGCTG A 21
(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear






(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "Synthetic Primer"

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 
GGCCGAGGCT GAGCGGGG 

(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "Synthetic Primer" 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 
CTGAAGGACG CCGCCGAGCA 
(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "Synthetic Primer" 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: 
CTCCAACGCC TGCCGCTGC 
(2) INFORMATION FOR SEQ ID NO: 28: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "Synthetic Primer" 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28: 
GCAGGAGGAG CGGGAGCAGG A 21 
(2) INFORMATION FOR SEQ ID NO: 29:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "Synthetic Primer" 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:
TCCAGTGCCC CGCAAGCCG 
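
The primers above are listed as single-stranded sequences. When comparing two primers that may lie on opposite strands, one of them first has to be converted to its reverse complement. The sketch below shows that conversion; the function name `reverse_complement` is illustrative, and the assumption that each primer is written 5' to 3' is ours rather than something stated in the entries.

```python
# Translation table for the four unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (spaces are ignored)."""
    cleaned = seq.replace(" ", "").upper()
    return cleaned.translate(COMPLEMENT)[::-1]


if __name__ == "__main__":
    seq_19 = "CGAAGAACTC CGCAGGGTCC"        # SEQ ID NO: 19 as listed
    seq_22 = "TGCGGAGTTC TTCGTGCTGA TGGAG"  # SEQ ID NO: 22 as listed
    rc_19 = reverse_complement(seq_19)
    print(rc_19)
    # Check whether the first 14 bases of SEQ ID NO: 22 occur in the
    # reverse complement of SEQ ID NO: 19.
    print(seq_22.replace(" ", "")[:14] in rc_19)
```

A shared run of this kind only shows that the two strings are complementary over that stretch; what the primers were designed for is a matter for the description, not for this check.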
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'TATA-BOX' 



AGACATAAAAAGGrCGGTC MOUSE 



AGGC ATA AAAAGG 



2CAGGC HUMAN 



::cAGAC COW 



CGGGCTl AAAAGG 



AGGCATAAAZ^GGTCGGTC RAT 



AGGCATAAAAAAGTCGGTC cVO 14 
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Figure O 5'OT-EST PROTEIN OF DIFFERENT SPECIES 

Mouse 

MLRALNRLAQRPGDRPPTPLLLPVRGRKTRHDPPAKSKVGRVQTPPAVDPAEFFVLTERY 
GQYRETVRALRLEFTLDVRRKLHEARAGVLAERKAQQAITEHRELMAWNRDENRRMQELR 
IARLQLEAQAQEVQKAEAQRQRAQEEQAWVQLKEQEVLKLQEEAKNFITRENLEARIEEA
LDSPKSYNWAVTKEGQWRN 

Rat 

MLRALNRLAARPGGQPPTLLLLPVRGRKTRHDPPAKSKVGRVKMPPAVDPAELFVLTERY 
RQYRETVRALRREFTLEVRGKLHEARAGVLAERKAQEAIREHQELMAWNREENRRLQELR
IARLQLEAQAQELRQAEVQAQRAQEEQAWVQLKEQEVLKLQEEAKNFITRENLEARIEEA
LDSPKSYNWAVTKEGQWRN 

Human 

MLRALSRLGAGTPCRPRAPLVLPARGRKTRHDPI^SKIERVlSnyiPPAV^ 
QHYRQTVRALRMEFVSEVQRKVHEARAGVLAERKAIjKDAAEHRELMAWN^ 

IARLRQEEREQEQRQALEQARKAEEVQAWAQRKEREVLQLQEEVKNFITRENLEARVEAA
LDSRKNYIWAITREGLVVRPQRRDS



Alignment 

Mouse MLRALNRLAQRPGDRPPTPLLLPVRGRKTRHDPPAKSKVGRVQTPPAVDPAEFFVLTERY 

Rat MLRALNRLAARPGGQPPTLLLLPVRGRKTRHDPPAKSKVGRVKMPPAVDPAELFVLTERY

Human MLRALSRLGAGTPCRPRAPLVLPARGRKTRHDPIJUCSKIERVISMPPAVDPA^ 

Mouse GQYRETVRALRLEFTLDVRRKLHEARAGVLAERKAQQAITEHRELMAWNRDENRRMQELR

Rat RQYRETVRALRREFTLEVRGKLHEARAGVLAERKAQEAIREHQELMAWNREENRRLQELR

Human Qhyrqtvralrmefv^sevqrkvhearagvlaerkalkdaaehrelmavwq;^ 

Mouse IARLQLEAQAQEVQKAEAQRQRAQEEQAWVQLKEQEVLKLQEEAKNFITRENLEARIEEA

Rat IARLQLEAQAQELRQAEVQAQRAQEEQAWVQLKEQEVLKLQEEAKNFITRENLEARIEEA

Human IARLRQEEREQEQRQALEQAREAEEVQAWAQRKEREVLQLQEEVKNFITRENLEARVEAA 

Mouse LDSPKSYNWAVTKEGQWRN 

Rat LDSPKSYNWAVTKEGQWRN 

Human LDSRKNYNWAITREGLWRPQRRDS
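
The alignment above places corresponding residues of the mouse, rat and human proteins in the same column, so the similarity of two rows can be summarised as the percentage of columns in which the residues are identical. The following is a minimal sketch of that calculation, assuming two rows of equal length with no gap characters; the function name `percent_identity` is illustrative.

```python
def percent_identity(row_a: str, row_b: str) -> float:
    """Percentage of aligned positions at which two equal-length rows agree."""
    if len(row_a) != len(row_b) or not row_a:
        raise ValueError("rows must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(row_a.upper(), row_b.upper()))
    return 100.0 * matches / len(row_a)


if __name__ == "__main__":
    # Final alignment rows for mouse and rat, as printed above.
    mouse = "LDSPKSYNWAVTKEGQWRN"
    rat = "LDSPKSYNWAVTKEGQWRN"
    print(percent_identity(mouse, rat))
```

Rows of different length, such as the longer human C-terminal row, would need gap padding before they could be compared in this way.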



Predicted deleted form in JP17
MLRALNRLAARPGGQPPTLLLLPVRGprprsf sapf ssqds

20 weeks

     non-transgenic      JP17 transgenic
a    260.33 ± 0.28       243.34 ± 0.13 ***
b     92.67 ± 0.29       115.14 ± 0.24 ***

52 weeks

     non-transgenic      JP17 transgenic
a    273.83 ± 0.28       261.0 ± 0.45
b    113.83 ± 0.10       157.83 ± 0.61 ***

                        Corticosterone    Leptin           Insulin         Glucose         Triglyceride      Cholesterol
                        ng/ml             ng/ml            ng/ml           mg/dl           mg/dl             mg/dl
MALE TRANSGENIC         168.9 +/- 23.5    *24.4 +/- 1.49   1.94 +/- 0.89   114.7 +/- 4.2   *295.6 +/- 28.7   122.3 +/- 6.4
MALE NON-TRANSGENIC     113.9 +/- 20.3    9.51 +/- 2.14    2.8 +/- 1.93    121.0 +/- 3.9   178.9 +/- 23.5    129.9 +/- 9.3
FEMALE TRANSGENIC       256.3 +/- 104.1   *14.74 +/- 1.38  2.51 +/- 0.64   126.3 +/- 3.3   224.2 +/- 52.3    94.9 +/- 5.9
FEMALE NON-TRANSGENIC   349.3 +/- 123.7   4.58 +/- 0.47    2.54 +/- 2.32   135.4 +/- 6.7   195.5 +/- 34.5    100.2 +/- 8.0

% dietary fat

FIG. 13